New build - can't boot properly and database locked

Status
Not open for further replies.

Sir.Robin

Guru
Joined
Apr 14, 2012
Messages
554
So I've tested FreeNAS in VMware and decided I wanted the real deal :)

I bought new hardware.
Asus M5A88-M (BIOS 1101)
AMD FX-4100
2x4GB Kingston ECC
Sandisk 4GB USB drive

Disks I have from before: 6x Seagate 7200.9 500GB

The motherboard is on the recommended list, so I thought this would be easy.
I disabled USB 3 in BIOS.

I downloaded the FreeNAS-8.2.0-BETA3-x64.img.xz file and extracted the FreeNAS-8.2.0-BETA3-x64.img.
Then I used Win32 image writer to write it to my USB stick.

Plugged it in and booted.

Boot takes FOREVER!!! Hangs at "geometry does not match label (16h,63s != 255h,63s)"

And when it finally boots and I log into the GUI... it won't even create a volume. Database locked error.

It is even recommended to boot from USB. And still these issues? wtf... :(
 

Sir.Robin

I'm trying... :|

Boots just fine without any disks connected.
 

Sir.Robin

Oh dear... after some trying and failing... I replaced one drive.

BIOS finds the drives faster. FreeNAS boots normally. No delays. There's a lot more "rattle" from the drives now.

FreeNAS does not seem to enjoy AHCI?


I have created a volume and a dataset.

CIFS share up and running.

Transfer speed write: 10-20MB/s. Normal?
 

bcrowder

Dabbler
Joined
Jan 6, 2012
Messages
36
You can do better. I have a raidZ2 array of 5 new 1.5TB drives, and another raidZ1 of drives that I reused from my last box. The new array gets 36 to 55 on writes and about 80+ on reads. On the array of 3 1TB drives I had one drive reporting SMART errors and was only getting 20 write and 35 read; I replaced it and it's up to the same speed as the other. Run the SMART tests on the other drives and read through the forum for other advice. The reason I say this is that drives shouldn't "rattle". Just went through this with mine. :)

Thanks,
Bill
 

Sir.Robin

Thanx for the replies :)

By "rattle" I mean I can hear them work. Normal sounds, that is :tongue:

Bit tired yesterday... had to set the USB drive as boot device after changing to AHCI... and now it boots fine, and I'm currently testing my system for stability.

Speed went up too. But not as much as I hoped for. Write speed began at 60MB/s (Windows reporting transfer speed over Ethernet) but went down to about 22MB/s.
Read utilizes about 90% of my 1Gbit link.

Read is fine over Ethernet then... but it would be nice to have better write performance.

Scrub seems fine too. About 5 minutes scrubbing 60GB over my 6 drives? :)
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
Sir.Robin said:
"Speed went up too. But not as much as I hoped for. Write speed began at 60MB/s (Windows reporting transfer speed over Ethernet) but went down to about 22MB/s.
Read utilizes about 90% of my 1Gbit link."

Read through this thread (thread 5338). There's another one, possibly with some different settings, that you can search for.

Sir.Robin said:
"Scrub seems fine too. About 5 minutes scrubbing 60GB over my 6 drives? :)"

FYI, 60GB is nothing with your # of drives & space. It should be quick. :)
 

Sir.Robin

Thanx will check it out!

Tested write performance with this from another thread:

dd if=/dev/zero of=tmp.dat bs=2048k count=50k (I used 5000)

and my result is (46727289 bytes/sec)

That's like... 46MB/s. Also... I hear the drives work a little... then it's silent for a while... then it writes a little again. And so on. I'm thinking I have a disk issue.
 

paleoN

Sir.Robin said:
"dd if=/dev/zero of=tmp.dat bs=2048k count=50k (I used 5000)

and my result is (46727289 bytes/sec)

That's like... 46MB/s. Also... I hear the drives work a little... then it's silent for a while... then it writes a little again. And so on. I'm thinking I have a disk issue."

Do me a favor and rerun the test like everyone else.
Code:
dd if=/dev/zero of=/mnt/vol01/tmp.dat bs=2048k count=50k
dd if=/mnt/vol01/tmp.dat of=/dev/null bs=2048k count=50k
It's supposed to be the exact same test or you should be posting in a separate thread.


Also, while you are running the test, properly this time, run the following in a separate SSH session:
Code:
gstat
You want to be on the lookout for one drive that has significantly higher latency than the rest or else a large amount of i/o's queued up in comparison to the rest of the disks. You can also check /var/log/messages to see if there are any timeout errors.
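
For the log check, a minimal sketch could look like the following. It assumes the kernel reports disk trouble with the word "timeout" somewhere in the line (the exact wording depends on the driver, so treat the pattern as a guess), and `count_timeouts` is just a made-up helper name.

```shell
#!/bin/sh
# Count log lines that mention a timeout (case-insensitive).
# The "timeout" pattern is an assumption; adjust it to the messages you see.
count_timeouts() {
    # grep -c prints the number of matching lines. It exits non-zero
    # when there are none, so "|| true" keeps the exit status clean.
    grep -ci 'timeout' "$1" || true
}
```

Calling `count_timeouts /var/log/messages` prints how many lines matched; anything above zero is worth reading in full.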

Have you run the long SMART tests on your drives yet? If not, do so after your testing.
 

Sir.Robin

It's in progress!

Running gstat and as far as I can tell all drives have almost identical values across the board.


Tested with SeaTools yesterday and all drives passed the short test. The long (SeaTools) test took 5 hours or so... so I skipped that one.


Thanx for the help! :)

PS: Looking at gstat, all drives work at the same speed, 99-100% busy for a while... then everything drops/pauses for a second or two before continuing.

Updated - Results:

[root@BATNAS] ~# dd if=/dev/zero of=/mnt/vol01/tmp.dat bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes transferred in 2511.947579 secs (42745391 bytes/sec)

[root@BATNAS] ~# dd if=/mnt/vol01/tmp.dat of=/dev/null bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes transferred in 459.018764 secs (233921118 bytes/sec)
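
To read those dd summaries in more familiar units, here is a small sketch that redoes the arithmetic: bytes divided by seconds, shown both as MB/s (10^6 bytes) and MiB/s (2^20 bytes). The function name `dd_rate` is made up for illustration; the inputs are the numbers from the two runs above.

```shell
#!/bin/sh
# dd_rate BYTES SECONDS -> prints "<MB/s> MB/s (<MiB/s> MiB/s)"
dd_rate() {
    awk -v b="$1" -v s="$2" 'BEGIN {
        rate = b / s    # bytes per second
        printf "%.1f MB/s (%.1f MiB/s)\n", rate / 1000000, rate / 1048576
    }'
}

dd_rate 107374182400 2511.947579   # write run: 42.7 MB/s (40.8 MiB/s)
dd_rate 107374182400 459.018764    # read run:  233.9 MB/s (223.1 MiB/s)
```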
 

paleoN

Sir.Robin said:
"It's in progress!"

Perfect. :)

Sir.Robin said:
"Tested with SeaTools yesterday and all drives passed the short test. The long (SeaTools) test took 5 hours or so... so I skipped that one."

From an SSH session you can run for each drive:
Code:
smartctl -t long /dev/adaX
Or whatever your drives are listed as. For 500GB drives it will probably take 1-2 hours, during which you will want to avoid using them.


Sir.Robin said:
"PS: Looking at gstat, all drives work at the same speed, 99-100% busy for a while... then everything drops/pauses for a second or two before continuing."

This may be the ZFS breathing/write stalls people have experienced.

Sir.Robin said:
"107374182400 bytes transferred in 2511.947579 secs (42745391 bytes/sec)"

Your writes are quite poor. :( How is your pool set up?
Code:
zpool status -v
 

Sir.Robin

[root@NAS01] ~# zpool status -v
  pool: vol01
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vol01       ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            ada0p2  ONLINE       0     0     0
            ada1p2  ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada3p2  ONLINE       0     0     0
            ada4p2  ONLINE       0     0     0
            ada5p2  ONLINE       0     0     0

errors: No known data errors


Thanx for the smart command... was looking for that :)

Can you do a SMART check on all drives at the same time?
 

paleoN

Sir.Robin said:
"Can you do a SMART check on all drives at the same time?"

Yes. The SMART check is internal to the drives and nothing is transferred to/from the SATA ports. For your drives:
Code:
smartctl -t long /dev/ada0
smartctl -t long /dev/ada1
smartctl -t long /dev/ada2
smartctl -t long /dev/ada3
smartctl -t long /dev/ada4
smartctl -t long /dev/ada5
Just press enter one after the other.
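
The same six commands can also be written as a loop. This is only a sketch: it assumes the drives really are ada0 through ada5 as in the pool listing, and it prints the commands by default so nothing gets started by accident. The `DRY_RUN` variable and the `start_long_tests` name are inventions for this example.

```shell
#!/bin/sh
# Start a long SMART self-test on each listed drive.
# With DRY_RUN=1 (the default here) the commands are only printed.
DRY_RUN=${DRY_RUN:-1}

start_long_tests() {
    for disk in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "smartctl -t long /dev/$disk"
        else
            smartctl -t long "/dev/$disk"
        fi
    done
}

start_long_tests ada0 ada1 ada2 ada3 ada4 ada5
```

Run it with `DRY_RUN=0` to actually start the tests. Once they have run, `smartctl -l selftest /dev/ada0` shows the self-test log for that drive.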


You have 512-byte sector drives, right? Either way, what's the output of:
Code:
zdb | grep ashift
 

Sir.Robin

Great, learning a lot here :) All drives are being tested now...

I believe they are 512b (sector?) 'cause they are some years old... Seagate 7200.9

[root@NAS01] ~# zdb | grep ashift
ashift=9
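
For anyone reading along: ashift is the base-2 logarithm of the sector size ZFS uses for the vdev, so the value can be turned back into bytes with a shift. A tiny sketch (the helper name `ashift_to_bytes` is made up):

```shell
#!/bin/sh
# Sector size in bytes = 2^ashift
ashift_to_bytes() {
    echo $((1 << $1))
}

ashift_to_bytes 9    # 512
ashift_to_bytes 12   # 4096
```

So ashift=9 matches 512-byte sectors.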
 

Sir.Robin

How do I know the selftest is finished? Oh.. nevermind.. I found it :)

One drive also reports the following:

[root@NAS01] ~# smartctl -H /dev/ada4
smartctl 5.42 2011-10-20 r3458 [FreeBSD 8.2-RELEASE-p6 amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Please note the following marginal Attributes:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
190 Airflow_Temperature_Cel 0x0022 055 045 045 Old_age Always In_the_past 45 (Min/Max 42/51)

Old_age??? How old is that? :tongue:
 

Sir.Robin

Smart outputs:

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 100 253 006 Pre-fail Always - 0
3 Spin_Up_Time 0x0002 088 086 000 Old_age Always - 0
4 Start_Stop_Count 0x0033 100 100 020 Pre-fail Always - 22
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 086 060 030 Pre-fail Always - 454466371
9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 16324
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0033 100 100 020 Pre-fail Always - 27
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 058 049 045 Old_age Always - 42 (Min/Max 42/42)
194 Temperature_Celsius 0x0022 042 051 000 Old_age Always - 42 (0 21 0 0 0)
195 Hardware_ECC_Recovered 0x001a 074 054 000 Old_age Always - 40299404
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 Data_Address_Mark_Errs 0x0032 100 253 000 Old_age Always - 0




SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 100 253 006 Pre-fail Always - 0
3 Spin_Up_Time 0x0002 088 086 000 Old_age Always - 0
4 Start_Stop_Count 0x0033 100 100 020 Pre-fail Always - 21
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 086 060 030 Pre-fail Always - 451829436
9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 16324
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0033 100 100 020 Pre-fail Always - 27
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 059 049 045 Old_age Always - 41 (Min/Max 40/41)
194 Temperature_Celsius 0x0022 041 051 000 Old_age Always - 41 (0 20 0 0 0)
195 Hardware_ECC_Recovered 0x001a 073 054 000 Old_age Always - 168241216
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 Data_Address_Mark_Errs 0x0032 100 253 000 Old_age Always - 0




SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 100 253 006 Pre-fail Always - 0
3 Spin_Up_Time 0x0002 088 086 000 Old_age Always - 0
4 Start_Stop_Count 0x0033 100 100 020 Pre-fail Always - 19
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 086 060 030 Pre-fail Always - 449507730
9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 16324
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0033 100 100 020 Pre-fail Always - 26
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 058 049 045 Old_age Always - 42 (Min/Max 41/42)
194 Temperature_Celsius 0x0022 042 051 000 Old_age Always - 42 (0 21 0 0 0)
195 Hardware_ECC_Recovered 0x001a 075 054 000 Old_age Always - 166516008
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 Data_Address_Mark_Errs 0x0032 100 253 000 Old_age Always - 0



SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 100 253 006 Pre-fail Always - 0
3 Spin_Up_Time 0x0002 087 086 000 Old_age Always - 0
4 Start_Stop_Count 0x0033 100 100 020 Pre-fail Always - 18
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 086 060 030 Pre-fail Always - 448233621
9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 16324
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0033 100 100 020 Pre-fail Always - 25
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 056 045 045 Old_age Always In_the_past 44 (Min/Max 43/44)
194 Temperature_Celsius 0x0022 044 055 000 Old_age Always - 44 (0 21 0 0 0)
195 Hardware_ECC_Recovered 0x001a 067 054 000 Old_age Always - 44585485
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 Data_Address_Mark_Errs 0x0032 100 253 000 Old_age Always - 0



ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 100 253 006 Pre-fail Always - 0
3 Spin_Up_Time 0x0002 087 086 000 Old_age Always - 0
4 Start_Stop_Count 0x0033 100 100 020 Pre-fail Always - 19
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 073 060 030 Pre-fail Always - 22344503
9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 16319
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0033 100 100 020 Pre-fail Always - 27
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 058 046 045 Old_age Always - 42 (Min/Max 41/42)
194 Temperature_Celsius 0x0022 042 054 000 Old_age Always - 42 (0 21 0 0 0)
195 Hardware_ECC_Recovered 0x001a 072 055 000 Old_age Always - 40333273
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 Data_Address_Mark_Errs 0x0032 100 253 000 Old_age Always - 0
 

paleoN

Sir.Robin said:
"190 Airflow_Temperature_Cel 0x0022 056 045 045 Old_age Always In_the_past 44 (Min/Max 43/44)
194 Temperature_Celsius 0x0022 044 055 000 Old_age Always - 44 (0 21 0 0 0)"

Ouch, that's hot. Particularly for those older disks. Get some direct airflow on them. I would try and keep the temperature < 40C.
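
A rough way to check every drive against that 40C mark is to pull attribute 194 out of the SMART attribute table. The sketch below assumes the column layout shown in the outputs above (raw value in the tenth field). `hot_drive_check` and the limit passed to it are illustrative, not an official threshold.

```shell
#!/bin/sh
# Read a SMART attribute table on stdin and report the temperature
# from attribute 194, flagging it when it is at or above the limit.
# usage: smartctl -A /dev/ada4 | hot_drive_check ada4 40
hot_drive_check() {
    awk -v name="$1" -v limit="$2" '
        $1 == 194 {
            temp = $10 + 0    # raw value, degrees Celsius
            status = (temp >= limit) ? "TOO HOT" : "ok"
            printf "%s: %d C %s\n", name, temp, status
        }'
}
```

Fed the attribute 194 line quoted above, it prints `ada4: 44 C TOO HOT`.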
 

Sir.Robin

Crap! I'm beginning to think this is all the speed I get from these disks. :(

Read speed is about what I should expect from reading old reviews. Sequential read tops out around 65MB/s and bottoms out around 25MB/s.
Firingsquad has a review with write speeds. Should top out around 50MB/s.

The write is a bit more stable now after a couple of tunables... at least I think so. Also I replaced one drive that I suspected was a tad slower than the others.

Anyway, gstat shows all drives writing pretty stably at about 12000kBps. That's around 12MB/s per drive. And with 6 drives (2 parity drives) you have 4x ~12MB/s as the theoretical max = roughly 46-48MB/s.
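
The arithmetic in that last paragraph, written out as a sketch. It uses the simple model from the post (only the non-parity disks contribute to write throughput), which ignores any ZFS overhead. Whether gstat's kBps means 1000 or 1024 bytes is left as an assumption here, so both conversions are printed. `raidz_write_estimate` is a made-up name.

```shell
#!/bin/sh
# raidz_write_estimate TOTAL_DISKS PARITY_DISKS PER_DISK_KBPS
raidz_write_estimate() {
    awk -v n="$1" -v p="$2" -v k="$3" 'BEGIN {
        total = (n - p) * k    # aggregate kBps across the data disks
        printf "%d kBps = %.1f MB/s (k=1000) or %.1f MiB/s (k=1024)\n", total, total / 1000, total / 1024
    }'
}

raidz_write_estimate 6 2 12000
# 48000 kBps = 48.0 MB/s (k=1000) or 46.9 MiB/s (k=1024)
```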

Now... I deleted my array and created a new one. Single drive. UFS.
And guess what? I get the same write speed (per disk) on this array. 12200kBps.

So is it a throttled speed or is it just the drives being too darn sucky?
 

Sir.Robin

Striping all drives with UFS only gives me half the write speed per drive. 6500kBps approx.

Striping all drives with ZFS gives me approx 12000kBps per drive as with raidz2.
 

paleoN

Personally, I wouldn't be putting those drives under load with those higher temps. I would keep them cooler unless you don't mind them starting to die off.

Delete your array and test the drives individually. Come to think of it, are your drives SATA or PATA? They were available in both if I'm not mistaken. How are they connected: directly to the motherboard or through an add-in card?

After you delete your array you can run:

This command will write to the hard drive and will delete data. Only run it on disks not part of an array.
Code:
dd if=/dev/zero of=/dev/ada0 bs=2048k count=50k

dd if=/dev/ada0 of=/dev/null bs=2048k count=50k
That will essentially give you your raw performance, bound by either the disks themselves or what they are connected through. If you still think you might have a disk that's pulling down the rest of your pool, then run the above commands for each disk, waiting until the previous test finishes.

Then you should run it on all disks at once. Ideally the per-disk performance should be the same as for a single disk tested alone.
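
A sketch of that procedure using only the read test, which does not modify the disks. It assumes the six drives are ada0 to ada5 and prints the commands unless `DRY_RUN=0` is set (both `DRY_RUN` and the function names are inventions for this example). The write test is left out on purpose because it destroys data.

```shell
#!/bin/sh
DRY_RUN=${DRY_RUN:-1}
DISKS="ada0 ada1 ada2 ada3 ada4 ada5"

# Read test for one disk, using the same dd parameters as the read command above.
read_test() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "dd if=/dev/$1 of=/dev/null bs=2048k count=50k"
    else
        dd if="/dev/$1" of=/dev/null bs=2048k count=50k
    fi
}

# One disk at a time, each test waiting for the previous one.
sequential_tests() {
    for d in $DISKS; do read_test "$d"; done
}

# All disks at once: start every test in the background, then wait.
parallel_tests() {
    for d in $DISKS; do read_test "$d" & done
    wait
}

sequential_tests
```

Compare the bytes/sec that dd reports per disk in the sequential pass with what each disk manages during `parallel_tests`. A big drop when all disks run together points at the controller or bus rather than a single drive.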
 