Joseph Bucar
Virtualization Environment:
Six Dell VMware hosts connected via multiple 10 GbE VLANs through HPE FlexFabric switches
Configuration:
Dell 730xd - 192 GB RAM - [LOG] Intel Optane 900P (280 GB) - [CACHE] (2) 1 TB PCIe Intel NVMe SSD - FreeNAS 11.3-U5
NICs
- Two 2-port Intel 10GBase-T cards, configured with a management network and three VLANs (two for storage, one for vMotion)
- NICs are combined in a LAGG set to FAILOVER; everything is dual-connected for redundancy to an HPE FlexFabric switch stack.
- Jumbo frames are enabled and verified to be working properly across the storage and vMotion VLANs (quick check sketched below).
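For reference, this is roughly how we verified the jumbo path from the FreeNAS box (a sketch; the address is just a placeholder for one of our storage-VLAN interfaces, and 8972 bytes of ICMP payload plus 28 bytes of headers equals the 9000-byte MTU):

# FreeBSD ping: -D sets the Don't Fragment bit, -s sets the ICMP payload size
ping -D -s 8972 10.10.10.21   # placeholder storage-VLAN address; should succeed end to end
ping -D -s 8973 10.10.10.21   # one byte over the limit; should fail if MTU 9000 is enforced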
Controllers/Enclosures
- The 730xd has 24 2.5" slots, which are used for SSDs only.
- Two Dell MD1200 enclosures, 24 slots total
- LSI external 2-port SAS card (attachment check sketched below)
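In case it helps, this is how we confirm what the HBA actually sees behind those enclosures (a sketch; device names will differ per system):

camcontrol devlist        # one line per disk/enclosure visible to the CAM subsystem, with its bus path
camcontrol devlist -v     # verbose listing that also shows the buses/controllers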
2 Pools
- pool7K1 - 24 x 8 TB SAS 7K drives - configured as 12 mirrored vdevs - log is the Intel Optane - cache is a 1 TB PCIe Intel NVMe SSD - pool is HEALTHY: 57% used / 36.06 TiB free
- poolSSD1 - 12 x 3.84 TB SAS Dell enterprise SSDs - configured as 6 mirrored vdevs - no log - cache is a 1 TB PCIe Intel NVMe SSD - pool is HEALTHY: 50% used / 8.27 TiB free (layout/latency checks sketched below)
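For anyone checking our layout, these are the commands we use to confirm the vdev structure and watch per-disk latency (a sketch; the 5-second interval is arbitrary):

zpool status poolSSD1         # confirms the 6 mirrored vdevs and per-disk health
zpool iostat -v poolSSD1 5    # per-vdev ops and bandwidth in 5-second samples
gstat -p                      # live ms/r and ms/w latency per physical disk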
Problem:
Our problem is with poolSSD1. We are suffering horrible performance on this pool, with bad read/write latency numbers. What is sad is that the 7K pool appears to be outperforming the SSD pool. We are also seeing the following messages in the FreeNAS shell:
Jan 25 09:38:15 dcXXXSAN02 ctl_datamove: tag 0x1549cb on (4:17:113) aborted
Jan 25 09:38:15 dcXXXSAN02 ctl_datamove: tag 0x1549cd on (4:17:113) aborted
Jan 25 09:38:15 dcXXXSAN02 ctl_datamove: tag 0x1549d0 on (4:17:113) aborted
Jan 25 09:38:15 dcXXXSAN02 ctl_datamove: tag 0x1549d3 on (4:17:113) aborted
Jan 25 14:12:41 dcXXXSAN02 ctl_datamove: tag 0xb55bc0 on (10:24:105) aborted
Jan 25 14:18:03 dcXXXSAN02 ctl_datamove: tag 0x3f941a on (4:25:114) aborted
Jan 25 14:34:30 dcXXXSAN02 kernel: Limiting closed port RST response from 1704 to 200 packets/sec
Jan 25 14:34:30 dcXXXSAN02 kernel: Limiting closed port RST response from 1704 to 200 packets/sec
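To try to make sense of the numbers in parentheses, we have been dumping the CTL LUN list and cross-referencing it against our iSCSI extents (a sketch; that the last field of the triplet corresponds to a CTL LUN ID is my own assumption, and part of what I am asking below):

ctladm devlist        # one line per configured CTL LUN: LUN number, backend, size, device
ctladm devlist -v     # verbose output with LUN-specific details to match against the GUI extents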
Questions:
- Can someone explain what the ctl_datamove error is and what might cause it?
- What does the 99:99:99 triplet refer to? From looking at it, it appears to me that the last number is the LUN with the issue.
- What does the tag hex value refer to?
- We have a second, identically configured FreeNAS system that does not suffer the same problems.
- Are there any suggestions, or possible issues you can see with this setup?
Joe