Shutdown problem with ZIL in mirror

Status
Not open for further replies.

gregober

Dabbler
Joined
Sep 30, 2012
Messages
17
Hello,


I am having trouble shutting down my FreeNAS server.
It is quite a big unit equipped with:

• 2 x Intel SSDs for the ZIL
• 6 x 3TB WD Red NAS drives
• 1 x USB disk-on-module (directly attached to the motherboard)

The motherboard is an Intel board with 12 disk connectors.

The SSDs are connected to SATA ports directly on the motherboard.
The 6 disks are connected to the Adaptec controller.


Operation has been going along quite OK… but when I shut the unit down, at the very end of the shutdown sequence, the system hangs.
The messages on screen state nothing in particular:

Code:
Syncing disks, vnodes remaining… 0 0 0 0 done
All buffers synced.
Uptime: ***x


And that's it… nothing more. The unit stays there and does not want to move!

I have done some testing, and it seems the hang is caused by the ZIL not being correctly unmounted: when I remove the log devices from the pool and simply configure it as plain ZFS (with no ZIL on the SSDs), the system can reboot OK!
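For reference, this is roughly how the mirrored log vdev can be detached and later re-attached on a v28 pool (the pool name `tank`, the vdev name `mirror-1`, and the device names are placeholders for illustration, not taken from this system):

```sh
# Show the pool layout; a mirrored ZIL appears under a "logs" section,
# typically as a vdev named something like mirror-1.
zpool status tank

# Remove the mirrored log vdev (log-device removal requires pool v19+).
zpool remove tank mirror-1

# Later, re-attach the two SSDs as a mirrored log device.
zpool add tank log mirror ada0 ada1
```

These are administrative commands against a live pool, so they are shown only as a sketch of the procedure, not something to paste blindly.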

Have I missed something? What have I done wrong?
Is this a known bug ?


Thanks.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I really cannot help you but it would be nice if you posted the version of FreeNAS you are using and what version your pool is up to (V15 or V28).

My only thought is how is the ZIL physically connected to your machine. Just some thoughts, but trust me, I'm not the expert on ZIL's... Since you can apparently remove your ZIL, I think you are running V28. If so could you just create a single SSD ZIL and test that out. I know you wouldn't want to do that with V15 at all due to potential pool corruption issues.
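Both of joeschmuck's suggestions can be checked from the FreeNAS shell before experimenting (the pool name `tank` is a placeholder):

```sh
# Report the on-disk pool format version (15 vs 28 matters here:
# log-device removal only works from pool v19 onward).
zpool get version tank

# List what each pool version adds, without upgrading anything.
zpool upgrade -v

# Test with a single, non-mirrored SSD as the log device.
zpool add tank log ada0
```

Again a sketch only; on a real system the device name would come from `camcontrol devlist` or the dmesg output below.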
 

gregober

Dabbler
Joined
Sep 30, 2012
Messages
17
I really cannot help you but it would be nice if you posted the version of FreeNAS you are using and what version your pool is up to (V15 or V28).

My only thought is how is the ZIL physically connected to your machine. Just some thoughts, but trust me, I'm not the expert on ZIL's... Since you can apparently remove your ZIL, I think you are running V28. If so could you just create a single SSD ZIL and test that out. I know you wouldn't want to do that with V15 at all due to potential pool corruption issues.

Yes, I am using v28 and the latest stable release, 8.3.
To be more precise:

FreeNAS-8.3.0-RELEASE-p1-x64 (r12825)
Platform Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
Memory 32711MB


I don't even know whether, with such a large amount of memory, it is necessary to have the ZIL on separate disks?

Anyway, this is how it was planned.


The system seems stable once booted; it is only on reboot… But this is a problem I'd like to avoid as much as possible:
I would need access to the remote KVM every time I need to reboot the device. That is not really something anyone would want.


I'll try this with a single disk for the ZIL and let you know how it went.


Thanks.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
From what I know (which isn't a lot) the ZIL is only to increase a committed write response to the user and then the write will be committed to the hard drive. This lowers the latency and apparent speed of writing to the end user and has nothing to do with writing the data to your hard drives any faster. The RAM in your system makes no difference in this aspect but the more RAM you have then the more data can be collected and then it still needs to go into the ZIL from there (as I understand it). If you have a lot of write operations then a ZIL may benefit you, or if you need extremely low latency. You RAM will help out on read operations where often read files are cached in the RAM (L-ARC).
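On the read side, the ARC usage joeschmuck mentions can be inspected directly on FreeBSD (these sysctl names exist on FreeBSD 8.x; the values are of course system-dependent):

```sh
# Current ARC size in bytes, and the configured ceiling.
sysctl kstat.zfs.misc.arcstats.size
sysctl vfs.zfs.arc_max

# Hit/miss counters give a feel for how effective the read cache is.
sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses
```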
 

gregober

Dabbler
Joined
Sep 30, 2012
Messages
17
I really cannot help you but it would be nice if you posted the version of FreeNAS you are using and what version your pool is up to (V15 or V28).

My only thought is how is the ZIL physically connected to your machine. Just some thoughts, but trust me, I'm not the expert on ZIL's... Since you can apparently remove your ZIL, I think you are running V28. If so could you just create a single SSD ZIL and test that out. I know you wouldn't want to do that with V15 at all due to potential pool corruption issues.


Using one disk instead of two for the ZIL does not change anything at all. It still hangs at the same point.


Here is the output of dmesg, for what it's worth.


Code:
[root@freenas] ~# dmesg 
Copyright (c) 1992-2012 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.3-RELEASE-p5 #2 r244158M: Wed Dec 12 10:04:42 PST 2012
    root@build.ixsystems.com:/home/jpaetzel/8.3.0/os-base/amd64/usr/home/jpaetzel/8.3.0/FreeBSD/src/sys/FREENAS.amd64 amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz (2394.25-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x206d7  Family = 6  Model = 2d  Stepping = 7
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x17bee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,AVX>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory  = 34359738368 (32768 MB)
avail memory = 33071357952 (31539 MB)
ACPI APIC Table: <INTEL  S2600GZ>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  2
 cpu2 (AP): APIC ID:  4
 cpu3 (AP): APIC ID:  6
WARNING: VIMAGE (virtualized network stack) is a highly experimental feature.
ACPI Warning: Invalid length for Pm1aControlBlock: 32, using default 16 (20101013/tbfadt-707)
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
kbd1 at kbdmux0
hpt27xx: RocketRAID 27xx controller driver v1.0 (Dec 12 2012 10:04:31)
cryptosoft0: <software crypto> on motherboard
aesni0: <AES-CBC,AES-XTS> on motherboard
acpi0: <INTEL S2600GZ> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, 9d000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 47 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> irq 47 at device 1.1 on pci0
pci2: <ACPI PCI bus> on pcib2
igb0: <Intel(R) PRO/1000 Network Connection version - 2.3.1> port 0x1060-0x107f mem 0xd2160000-0xd217ffff,0xd21b0000-0xd21b3fff irq 27 at device 0.0 on pci2
igb0: Using MSIX interrupts with 5 vectors
igb0: Ethernet address: 00:1e:67:54:9f:cd
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: [ITHREAD]
igb1: <Intel(R) PRO/1000 Network Connection version - 2.3.1> port 0x1040-0x105f mem 0xd2140000-0xd215ffff,0xd21a0000-0xd21a3fff irq 30 at device 0.1 on pci2
igb1: Using MSIX interrupts with 5 vectors
igb1: Ethernet address: 00:1e:67:54:9f:ce
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: [ITHREAD]
igb2: <Intel(R) PRO/1000 Network Connection version - 2.3.1> port 0x1020-0x103f mem 0xd2120000-0xd213ffff,0xd2190000-0xd2193fff irq 28 at device 0.2 on pci2
igb2: Using MSIX interrupts with 5 vectors
igb2: Ethernet address: 00:1e:67:54:9f:cf
igb2: [ITHREAD]
igb2: [ITHREAD]
igb2: [ITHREAD]
igb2: [ITHREAD]
igb2: [ITHREAD]
igb3: <Intel(R) PRO/1000 Network Connection version - 2.3.1> port 0x1000-0x101f mem 0xd2100000-0xd211ffff,0xd2180000-0xd2183fff irq 29 at device 0.3 on pci2
igb3: Using MSIX interrupts with 5 vectors
igb3: Ethernet address: 00:1e:67:54:9f:d0
igb3: [ITHREAD]
igb3: [ITHREAD]
igb3: [ITHREAD]
igb3: [ITHREAD]
igb3: [ITHREAD]
pcib3: <ACPI PCI-PCI bridge> irq 47 at device 2.0 on pci0
pci4: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> irq 47 at device 2.2 on pci0
pci5: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> irq 16 at device 3.0 on pci0
pci6: <ACPI PCI bus> on pcib5
aacu0: <Adaptec RAID Controller> mem 0xd1c00000-0xd1ffffff,0xd2050000-0xd20507ff,0xd2040000-0xd20400ff irq 40 at device 0.0 on pci6
aacu0: Enable Raw I/O
aacu0: Enable 64-bit array
aacu0: New comm. interface type1 enabled
aacu0: [ITHREAD]
aacu0: Adaptec 6805, aac driver 3.1.2-30035
aacp0: <Container Bus> on aacu0
aacp1: <SCSI Passthrough Bus> on aacu0
aacp2: <SCSI Passthrough Bus> on aacu0
aacp3: <SCSI Passthrough Bus> on aacu0
pcib6: <ACPI PCI-PCI bridge> irq 16 at device 3.2 on pci0
pci7: <ACPI PCI bus> on pcib6
pci0: <base peripheral> at device 4.0 (no driver attached)
pci0: <base peripheral> at device 4.1 (no driver attached)
pci0: <base peripheral> at device 4.2 (no driver attached)
pci0: <base peripheral> at device 4.3 (no driver attached)
pci0: <base peripheral> at device 4.4 (no driver attached)
pci0: <base peripheral> at device 4.5 (no driver attached)
pci0: <base peripheral> at device 4.6 (no driver attached)
pci0: <base peripheral> at device 4.7 (no driver attached)
pci0: <base peripheral> at device 5.0 (no driver attached)
pci0: <base peripheral> at device 5.2 (no driver attached)
pcib7: <ACPI PCI-PCI bridge> irq 16 at device 17.0 on pci0
pci8: <ACPI PCI bus> on pcib7
pci0: <simple comms> at device 22.0 (no driver attached)
pci0: <simple comms> at device 22.1 (no driver attached)
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xd2320000-0xd23203ff irq 22 at device 26.0 on pci0
ehci0: [ITHREAD]
usbus0: EHCI version 1.0
usbus0: <EHCI (generic) USB 2.0 controller> on ehci0
pcib8: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci9: <ACPI PCI bus> on pcib8
pcib9: <ACPI PCI-PCI bridge> irq 19 at device 28.7 on pci0
pci10: <ACPI PCI bus> on pcib9
vgapci0: <VGA-compatible display> mem 0xd0000000-0xd0ffffff,0xd1810000-0xd1813fff,0xd1000000-0xd17fffff irq 19 at device 0.0 on pci10
ehci1: <EHCI (generic) USB 2.0 controller> mem 0xd2310000-0xd23103ff irq 20 at device 29.0 on pci0
ehci1: [ITHREAD]
usbus1: EHCI version 1.0
usbus1: <EHCI (generic) USB 2.0 controller> on ehci1
pcib10: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci11: <ACPI PCI bus> on pcib10
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
ahci0: <Intel Patsburg AHCI SATA controller> port 0x2070-0x2077,0x2060-0x2063,0x2050-0x2057,0x2040-0x2043,0x2020-0x203f mem 0xd2300000-0xd23007ff irq 21 at device 31.2 on pci0
ahci0: [ITHREAD]
ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich0: [ITHREAD]
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich1: [ITHREAD]
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
pcib11: <ACPI Host-PCI bridge> on acpi0
pci255: <ACPI PCI bus> on pcib11
pci255: <base peripheral> at device 8.0 (no driver attached)
pci255: <base peripheral> at device 9.0 (no driver attached)
pci255: <base peripheral> at device 10.0 (no driver attached)
pci255: <base peripheral> at device 10.1 (no driver attached)
pci255: <base peripheral> at device 10.2 (no driver attached)
pci255: <base peripheral> at device 10.3 (no driver attached)
pci255: <base peripheral> at device 11.0 (no driver attached)
pci255: <base peripheral> at device 11.3 (no driver attached)
pci255: <base peripheral> at device 12.0 (no driver attached)
pci255: <base peripheral> at device 12.1 (no driver attached)
pci255: <base peripheral> at device 12.6 (no driver attached)
pci255: <base peripheral> at device 12.7 (no driver attached)
pci255: <base peripheral> at device 13.0 (no driver attached)
pci255: <base peripheral> at device 13.1 (no driver attached)
pci255: <base peripheral> at device 13.6 (no driver attached)
pci255: <base peripheral> at device 14.0 (no driver attached)
pci255: <dasp> at device 14.1 (no driver attached)
pci255: <base peripheral> at device 15.0 (no driver attached)
pci255: <base peripheral> at device 15.1 (no driver attached)
pci255: <base peripheral> at device 15.2 (no driver attached)
pci255: <base peripheral> at device 15.3 (no driver attached)
pci255: <base peripheral> at device 15.4 (no driver attached)
pci255: <base peripheral> at device 15.5 (no driver attached)
pci255: <base peripheral> at device 15.6 (no driver attached)
pci255: <base peripheral> at device 16.0 (no driver attached)
pci255: <base peripheral> at device 16.1 (no driver attached)
pci255: <base peripheral> at device 16.2 (no driver attached)
pci255: <base peripheral> at device 16.3 (no driver attached)
pci255: <base peripheral> at device 16.4 (no driver attached)
pci255: <base peripheral> at device 16.5 (no driver attached)
pci255: <base peripheral> at device 16.6 (no driver attached)
pci255: <base peripheral> at device 16.7 (no driver attached)
pci255: <base peripheral> at device 17.0 (no driver attached)
pci255: <base peripheral> at device 19.0 (no driver attached)
pci255: <dasp> at device 19.1 (no driver attached)
pci255: <dasp> at device 19.4 (no driver attached)
pci255: <dasp> at device 19.5 (no driver attached)
pci255: <base peripheral> at device 19.6 (no driver attached)
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
atrtc0: <AT realtime clock> port 0x70-0x77 irq 8 on acpi0
atrtc0: Warning: Couldn't map I/O.
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
uart1: [FILTER]
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,0xc9000-0xc9fff,0xca000-0xcafff,0xcb000-0xcbfff,0xcc000-0xd27ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
ppc0: cannot reserve I/O port range
coretemp0: <CPU On-Die Thermal Sensors> on cpu0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
coretemp1: <CPU On-Die Thermal Sensors> on cpu1
est1: <Enhanced SpeedStep Frequency Control> on cpu1
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 1ee700001800
device_attach: est1 attach returned 6
p4tcc1: <CPU Frequency Thermal Control> on cpu1
coretemp2: <CPU On-Die Thermal Sensors> on cpu2
est2: <Enhanced SpeedStep Frequency Control> on cpu2
p4tcc2: <CPU Frequency Thermal Control> on cpu2
coretemp3: <CPU On-Die Thermal Sensors> on cpu3
est3: <Enhanced SpeedStep Frequency Control> on cpu3
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 1f1000001800
device_attach: est3 attach returned 6
p4tcc3: <CPU Frequency Thermal Control> on cpu3
fuse4bsd: version 0.3.9-pre1, FUSE ABI 7.8
Timecounters tick every 1.000 msec
hpt27xx: no controller detected.
usbus0: 480Mbps High Speed USB v2.0
usbus1: 480Mbps High Speed USB v2.0
ugen0.1: <Intel> at usbus0
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
ugen1.1: <Intel> at usbus1
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
ugen0.2: <vendor 0x8087> at usbus0
uhub2: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus0
ugen1.2: <vendor 0x8087> at usbus1
uhub3: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus1
uhub2: 6 ports with 6 removable, self powered
uhub3: 8 ports with 8 removable, self powered
ugen1.3: <InnoDisk> at usbus1
umass0: <InnoDisk USB EDC, class 0/0, rev 2.00/9.10, addr 3> on usbus1
ugen1.4: <vendor 0x04d9> at usbus1
ukbd0: <vendor 0x04d9 USB Keyboard, class 0/0, rev 1.10/3.10, addr 4> on usbus1
kbd2 at ukbd0
uhid0: <vendor 0x04d9 USB Keyboard, class 0/0, rev 1.10/3.10, addr 4> on usbus1
ugen1.5: <American Megatrends Inc.> at usbus1
ukbd1: <Keyboard Interface> on usbus1
kbd3 at ukbd1
ums0: <Mouse Interface> on usbus1
ums0: 3 buttons and [Z] coordinates ID=0
ada0 at ahcich0 bus 0 scbus4 target 0 lun 0
ada0: <INTEL SSDSA2BZ100G3 6PB10362> ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 95396MB (195371568 512 byte sectors: 16H 63S/T 16383C)
ada1 at ahcich1 bus 0 scbus5 target 0 lun 0
ada1: <INTEL SSDSA2BZ100G3 6PB10362> ATA-8 SATA 2.x device
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 95396MB (195371568 512 byte sectors: 16H 63S/T 16383C)
da6 at umass-sim0 bus 0 scbus6 target 0 lun 0
da6: <InnoDisk USB EDC 0910> Fixed Direct Access SCSI-0 device 
da6: 40.000MB/s transfers
da6: 3920MB (8028160 512 byte sectors: 255H 63S/T 499C)
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
da0 at aacp1 bus 0 scbus1 target 0 lun 0
da0: <WDC WD30EFRX-68AX9N0 80.0> Fixed Direct Access SCSI-5 device 
da0: 300.000MB/s transfers
da0: Command Queueing enabled
da0: 2856950MB (5851033600 512 byte sectors: 255H 63S/T 364209C)
da1 at aacp1 bus 0 scbus1 target 1 lun 0
da1: <WDC WD30EFRX-68AX9N0 80.0> Fixed Direct Access SCSI-5 device 
da1: 300.000MB/s transfers
da1: Command Queueing enabled
da1: 2856950MB (5851033600 512 byte sectors: 255H 63S/T 364209C)
da2 at aacp1 bus 0 scbus1 target 4 lun 0
da2: <WDC WD30EFRX-68AX9N0 80.0> Fixed Direct Access SCSI-5 device 
da2: 300.000MB/s transfers
da2: Command Queueing enabled
da2: 2856950MB (5851033600 512 byte sectors: 255H 63S/T 364209C)
da3 at aacp1 bus 0 scbus1 target 5 lun 0
da3: <WDC WD30EFRX-68AX9N0 80.0> Fixed Direct Access SCSI-5 device 
da3: 300.000MB/s transfers
da3: Command Queueing enabled
da3: 2856950MB (5851033600 512 byte sectors: 255H 63S/T 364209C)
da4 at aacp1 bus 0 scbus1 target 6 lun 0
da4: <WDC WD30EFRX-68AX9N0 80.0> Fixed Direct Access SCSI-5 device 
da4: 300.000MB/s transfers
da4: Command Queueing enabled
da4: 2856950MB (5851033600 512 byte sectors: 255H 63S/T 364209C)
da5 at aacp1 bus 0 scbus1 target 7 lun 0
da5: <WDC WD30EFRX-68AX9N0 80.0> Fixed Direct Access SCSI-5 device 
da5: 300.000MB/s transfers
da5: Command Queueing enabled
da5: 2856950MB (5851033600 512 byte sectors: 255H 63S/T 364209C)
GEOM: da6s1: geometry does not match label (16h,63s != 255h,63s).
Trying to mount root from ufs:/dev/ufs/FreeNASs1a
ZFS filesystem version 5
ZFS storage pool version 28
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I'm starting to be out of my comfort zone. I really wish someone who has worked these kinds of issues would chime in here.

My only other comment is: Where is the ZIL physically connected? You never answered that question above. If it is not connected to the motherboard then maybe you could rearrange your hard drives so the ZIL is connected to the MB. If that doesn't work then unfortunately I'm out of suggestions other than to run it without a ZIL, but that's entirely up to you and what you intend to use the NAS for.

-Mark
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
OK, I'll pipe in. The ZIL doesn't work the way you think it does.

The ZIL acts as a non-volatile storage space for data that needs to be committed to the zpool. Anything in the ZIL must also be in RAM. So having a system with 4GB of RAM and a 128GB ZIL drive is just plain stupid. This is also why the FreeNAS manual says that you should always max out your RAM before considering a ZIL or L2ARC. If your system is already starved for RAM, you really aren't doing it any favors by creating another use for the RAM. Data is not saved to the ZIL, removed from RAM, and committed later. You can expect that any data written to the ZIL will never be read (because it is always stored in RAM) except after an improper shutdown of the system.

Here's the reason why a ZIL (and an L2ARC) is pointless for you (as well as most other people). If any transaction is greater than 64kbytes then the ZIL is not used at all. For instance, if you start writing a 20GB file to the zpool, each transaction will be very large (probably a few hundred MB), so the ZIL does absolutely nothing. If you are saving a 30kbyte office document, that may be committed to the ZIL. To be 100% precise, the write that creates any given file of any size and the write that closes the file may be saved to the ZIL, but the contents of your 20GB file will be committed directly to the zpool. The contents of your 30kbyte file may be committed to the ZIL too.

Now, if your zpool isn't being heavily used then the ZIL really does nothing but perhaps save the server a millisecond or two, because the server will write to the ZIL and then, milliseconds later, save to the zpool. For you as the user, I can bet you millions of dollars you will never notice those 2-3 milliseconds. Not to mention that ZFS has an advanced read-ahead cache, so if you are streaming a video or something you won't even know about the other write. On the other hand, if you are writing lots and lots of data all over the place and the server is so busy it can't write the data for a while (think 10+ seconds), then the ZIL may save you those 10 seconds. But I can also guarantee that if you are running a production server so busy that the time to write is measured in seconds, you'll know all of this already and know to use a ZIL.

So what is a ZIL good for? Databases (and iSCSI to an extent). Databases because they typically write single records constantly throughout the database file (think random writes and random reads), and generally you won't see a record that comes out to be >64kbytes. That's a LOT of data for most database situations.

iSCSI also appears to see some benefit (but a ZIL is not a total solution for performance issues with iSCSI). Imagine saving a 10MB file to an iSCSI device. You have 3 main writes involved (ignoring potential file system journals and such): the file creation, the file contents, and the file closing. The file creation and closing will always be extremely small (always 4kbytes or less, as they must be committed in a single write), so the creation and closing steps may be committed to the ZIL. But the file contents won't be saved to the ZIL and will be committed directly. So you have 1 big write and 2 smaller (potentially deferred until later) writes, saving you some time in high-load environments. iSCSI has several issues because, to ZFS, it operates like a database of the size you specified, but instead of constant small reads and writes you have a mixture of small reads and writes (files opening, closing, and directory listings) plus reading and writing of large files (i.e. files >64kbytes). Additionally, because ZFS is a copy-on-write file system, any write will always go to a new, unused location on the zpool. This can cause excessive fragmentation that can make your zpool so slow as to cause iSCSI timeout issues. Since there is no defrag for ZFS, the only solution is to move the iSCSI device off of the zpool and back on, in the hope that the newly created file will be assigned a somewhat contiguous area for reads. But this is only a temporary solution, as more writes will cause fragmentation again.

So what am I saying? If you don't have an iSCSI device set up on your FreeNAS zpool, and you do not have a database or any other process that does lots of random writes guaranteed to be smaller than 64kbytes most of the time, AND your server is heavily loaded (think hard drive lights on more than 90% of the time all day long), then your ZIL is a waste of money.

My recommendation: get rid of your ZIL and watch your server performance not change at all, while your problem gets solved. If you're going to tell me that there was a major performance hit and you aren't doing something borderline stupid with your server, I'll probably start ignoring you. :p I have had people try to argue with me because they clicked Save in Microsoft Office and claimed it took 3 seconds to write the .doc file instead of 2 seconds with the ZIL, so therefore the ZIL makes it faster. I just have to say to those people: "if a second is really worth that much to you, why are you even bothering to spend 2 minutes to grace me with a reply?" You're never going to convince me that your average user is going to call IT support or jump into a forum posting about their "slow" FreeNAS server because it took 3 seconds instead of 2 to write a file.

All of this stuff I just explained is why the ZIL and L2ARC are about useless for the average Joe. So I always tell people: unless you could write a book on the internal workings of ZFS, you have no business spending money and time on a ZIL and/or L2ARC. The people who could write that book are also smart enough to know whether a ZIL or L2ARC will help in their situation.
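A low-risk way to test whether the sync-write path is implicated at all is to toggle the `sync` dataset property on a scratch dataset (the property exists in the ZFS v28 code base; `tank/scratch` is a placeholder name, and `sync=disabled` should never be left on for data you care about):

```sh
# With sync=always, every write is forced through the ZIL/SLOG;
# with sync=disabled, the ZIL is bypassed entirely.
zfs create tank/scratch
zfs set sync=always tank/scratch     # all writes hit the log device
# ... run a write workload, watching 'zpool iostat -v tank 1' ...
zfs set sync=disabled tank/scratch   # ZIL bypassed (unsafe for real data)
# ... rerun the workload; a large difference means sync writes dominate ...
zfs inherit sync tank/scratch        # restore the default behavior
```

If behavior (throughput, or the shutdown hang) is identical in both modes, the ZIL itself is unlikely to be the moving part.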
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
You know, you should put this into a presentation :cool:
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
ROFL! I was joking! I'm laughing so hard I screwed up this message a few times.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I figured as much. That's why I said "Silly Joe".
 

gregober

Dabbler
Joined
Sep 30, 2012
Messages
17
Hi, again,

This is indeed a very interesting answer, and I would like to thank you for the time it took you to write it.

I do understand that from your point of view I look like someone who does not really need a ZIL (and that could be true).

But on the other hand I am about to offer (should I say resell?) some pieces of hardware which might in turn interest people who do need a ZIL, and obviously, as I am doing my job in a professional way, I like to test and understand what I am reselling. Furthermore, I am planning to use this server (after the testing period ends) as an iSCSI device (though I doubt it'll be heavily used).

So my short answer is: no, I don't really need a ZIL; yes, I do need to understand and learn more about the way it works and how to use it.

Lastly: I don't really understand why a ZIL could block a reboot procedure… And if it is because of a ZIL problem, I'd like to learn how to tune this.


Hope you understand my point of view on the problem… Simply put: learning and understanding how to solve problems.



Thanks a lot.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Yeah, at this point I can't help you. I have no idea why the ZIL isn't working right. To be completely honest, if you are selling this as a product and you intend to use iscsi you should consider going with UFS instead of ZFS.

If the iscsi was going to be read-only, then I might consider it for ZFS. But because of how many people have had iscsi failures because ZFS couldn't keep up with the workload(and you don't want customers calling you 2 months later complaining because the server is completely unusable) you should consider doing something. We've seen people that had less than 2MB/sec over their iscsi device in less than 2 months of server uptime. Additionally, if the zpool uses CIFS and iscsi, when iscsi performance tanks it will tank CIFS too because the zpool just can't keep up. I promise you that you will eventually get phone calls. It's a matter of when, not if.

See the support ticket at http://support.freenas.org/ticket/1531. None of the workarounds discussed there are really saviors, and some come with considerable risk to your data (have religious backups!) or considerable hardware costs. The most commonly recommended solution is to go to UFS for iSCSI. The issue is that the way iSCSI works and the way ZFS works aren't in line with each other. Because of the conflict between these two methodologies, performance starts out very high and only goes downhill. This has been hashed out so many times between ZFS and iSCSI that the senior members involved have given up all hope and stopped discussing the topic in the forum, because the information was being posted 2-3 times a week for quite a while. There is no "fix" for it with ZFS and never will be.

Edit: Odd thought I just came up with. You said on reboot it will lock up. What happens if you try to do a shutdown? I realize this won't work for remote access situations, but the way a reboot and shutdown work is slightly different and may change the result and may provide some insight into what's wrong.
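For the record, the two paths can be exercised from the console like this on FreeBSD (both need root):

```sh
# Reboot: syncs disks, unmounts, and warm-restarts. Unlike plain
# 'reboot', shutdown(8) also runs the rc shutdown scripts first.
shutdown -r now

# Power-off: the same sync/unmount path, then an ACPI power-down
# instead of a warm restart; whether the hang appears in one case
# and not the other helps localize the problem.
shutdown -p now
```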
 

gregober

Dabbler
Joined
Sep 30, 2012
Messages
17
Yeah, at this point I can't help you. I have no idea why the ZIL isn't working right. To be completely honest, if you are selling this as a product and you intend to use iscsi you should consider going with UFS instead of ZFS.

Well, snapshots are quite convenient tools…
And ZFS also seems quite stable (at least it has been for the various uses I have for it). But that is not with iSCSI, which I have only tested very lightly.

If the iscsi was going to be read-only, then I might consider it for ZFS. But because of how many people have had iscsi failures because ZFS couldn't keep up with the workload(and you don't want customers calling you 2 months later complaining because the server is completely unusable) you should consider doing something. We've seen people that had less than 2MB/sec over their iscsi device in less than 2 months of server uptime. Additionally, if the zpool uses CIFS and iscsi, when iscsi performance tanks it will tank CIFS too because the zpool just can't keep up. I promise you that you will eventually get phone calls. It's a matter of when, not if.

Ok - good to know.

See the support ticket at http://support.freenas.org/ticket/1531. None of the workarounds discussed there are really saviors, and some come with considerable risk to your data (have religious backups!) or considerable hardware costs. The most commonly recommended solution is to go to UFS for iSCSI. The issue is that the way iSCSI works and the way ZFS works aren't in line with each other. Because of the conflict between these two methodologies, performance starts out very high and only goes downhill. This has been hashed out so many times between ZFS and iSCSI that the senior members involved have given up all hope and stopped discussing the topic in the forum, because the information was being posted 2-3 times a week for quite a while. There is no "fix" for it with ZFS and never will be.

OK - but didn't Sun / Oracle come up with some solution to this?
I know it's not FreeBSD, but iSCSI with ZFS is not so uncommon (or am I missing something)?

Edit: Odd thought I just came up with. You said on reboot it will lock up. What happens if you try to do a shutdown? I realize this won't work for remote access situations, but the way a reboot and shutdown work is slightly different and may change the result and may provide some insight into what's wrong.

I am away from my server; I'll try that tomorrow morning (Paris time) and report.

Thanks a lot, it's nice having feedback from someone with obviously more expertise on the subject than I have.


Thanks.

G.B.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
iSCSI on ZFS is not as common as you might think. Personally, I've seen very, very few systems use iSCSI. They were almost always read-only and had a separate location for their data.

There is not, and never will be, a fix without some major overhaul of ZFS. ZFS is copy-on-write (this is the entire basis ZFS is built on), and CoW is not a good idea for iSCSI. They just aren't a good fit together. Of course, I'm sure there are people out there making 150k/year who could tweak it to make it work beautifully, but the issue is you need extensive internal knowledge of ZFS and of your load type to tweak it well. As I've said before, though, iSCSI uses large and small writes, so tweaking is at the very least difficult. The easiest fix I can "see" is adding defrag. But since ZFS is closed source under Oracle now, don't expect to see the source code soon, if ever. And defrag isn't a feature that is high on Oracle's to-do list.

This is why we just tell people to use UFS if they plan to use iSCSI. But this thread has come full circle (as every thread discussing ZILs does) and we're back to you saying you may sell servers with it, while also saying you've only lightly tested it. So my advice is still to pull out the ZIL. If a customer wants iSCSI, then go UFS for their iSCSI device. Snapshots won't work so well for iSCSI devices that are open anyway; your "backup" can be corrupted garbage. Backups of iSCSI devices should be handled from within the system using them, or only after disconnecting the iSCSI device.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
Anything in the ZIL must also be in RAM.
:confused:

So having a system with 4GB of RAM and a 128GB ZIL drive is just plain stupid.
Well, a 128GB SLOG is stupid, as you can likely get away with no more than a few GBs. Unless you bought a larger drive for its increased write IOPS.
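A rough back-of-the-envelope illustration of why a few GB is plenty (the 5-second transaction-group interval and single-GbE link are illustrative assumptions, not figures from this thread):

```sh
# The SLOG only ever holds sync writes not yet flushed with a
# transaction group, so a couple of flush intervals of incoming
# data bounds its useful size.
link_mb_per_s=125      # ~1 GbE wire speed in MB/s (assumption)
txg_seconds=5          # assumed transaction-group flush interval
intervals_kept=2       # keep two intervals' worth, to be safe
echo "$(( link_mb_per_s * txg_seconds * intervals_kept )) MB"   # prints "1250 MB"
```

On that arithmetic, even saturating gigabit Ethernet with pure sync writes needs only a GB or two of log space.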

If any transaction is greater than 64kbytes then the ZIL is not used at all.
Wrong on two counts. The value is 32KB now. When using a SLOG, all sync writes go through the SLOG, as [post=49529]jgreco thought/knew[/post] and I also confirmed.

So what am I saying? If you don't have an iSCSI device set up on your FreeNAS zpool, and you do not have a database or any other process that does lots of random writes guaranteed to be smaller than 64kbytes most of the time, AND your server is heavily loaded (think hard drive lights on more than 90% of the time all day long), then your ZIL is a waste of money.
Heavy NFS, i.e. lots of sync writes, is also a use case for a SLOG.

I'm just saying that the ZIL and L2ARC are so misunderstood that its borderline crazy how many people somehow jump to the conclusion that they think they need a ZIL and/or L2ARC and complain because the system isn't as awesome as they had hoped.
Yes, but they are on SSDs and magically make it faster. :rolleyes:

To be completely honest, if you are selling this as a product and you intend to use iscsi you should consider going with UFS instead of ZFS.
Not a bad idea until the system is extensively tested with iSCSI on ZFS and the performance & limitations are understood.

We've seen people that had less than 2MB/sec over their iscsi device in less than 2 months of server uptime.
Were they actually competent though?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I'm still waiting to find out where the ZILs are physically connected in the system. I don't know whether it has any bearing, but it's possible, and if they are not connected to the MB then you may need to move them and test your system that way to see if it makes a difference.
 