iSCSI dropping connection

Arun Gupta

Dabbler
Joined
Dec 22, 2013
Messages
37
I have been wrestling with this issue for the last few days. I searched Google and this forum but could not find anything that helps. Here is my setup (this is a home lab, so nothing critical):

I have set up a brand-new FreeNAS 11.2U5 instance as a VM on VMware ESXi 6.5. The host is a Dell T110 server, and the FreeNAS VM is currently allocated 10GB of memory. I attached a 10TB disk to the host, created virtual disks of approx. 2TB each, and attached them to FreeNAS as disks. The only sharing service configured is iSCSI, with target, initiators, and extents all set up. I then configured two Linux 7 VMs running on another host (a Dell T420 server) as initiators to use the iSCSI disks. I can see the disks and use them without any problem, but roughly every 20 minutes the following message is logged on the FreeNAS console:

Sep 15 12:07:31 agismbgnas01 WARNING: 192.168.250.70 (iqn.2014-10.com.garimallc:lin12corc01): connection error; dropping connection
Sep 15 12:28:14 agismbgnas01 WARNING: 192.168.250.70 (iqn.2014-10.com.garimallc:lin12corc01): connection error; dropping connection
Sep 15 12:49:19 agismbgnas01 WARNING: 192.168.250.70 (iqn.2014-10.com.garimallc:lin12corc01): connection error; dropping connection
Sep 15 13:09:50 agismbgnas01 WARNING: 192.168.250.70 (iqn.2014-10.com.garimallc:lin12corc01): connection error; dropping connection

On the Linux VM, I see corresponding messages:

Sep 15 12:07:52 lin12corc01 iscsid: Kernel reported iSCSI connection 4:0 error (1022 - Invalid or unknown error code) state (3)
Sep 15 12:08:17 lin12corc01 iscsid: connection4:0 is operational after recovery (2 attempts)
Sep 15 12:28:34 lin12corc01 iscsid: Kernel reported iSCSI connection 4:0 error (1022 - Invalid or unknown error code) state (3)
Sep 15 12:29:22 lin12corc01 iscsid: connection4:0 is operational after recovery (4 attempts)
Sep 15 12:49:39 lin12corc01 iscsid: Kernel reported iSCSI connection 4:0 error (1022 - Invalid or unknown error code) state (3)
Sep 15 12:49:56 lin12corc01 iscsid: connection4:0 is operational after recovery (2 attempts)
Sep 15 13:10:10 lin12corc01 iscsid: Kernel reported iSCSI connection 4:0 error (1022 - Invalid or unknown error code) state (3)
Sep 15 13:10:27 lin12corc01 iscsid: connection4:0 is operational after recovery (2 attempts)

There is no load on the system. This matters because I use these iSCSI disks for Oracle Clusterware, ASM, and databases. The disks keep disconnecting and reconnecting. ASM can generally tolerate this, but if the recovery attempts exceed 8, ASM drops the disks and the databases crash.
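To put numbers on the pattern in the initiator log, here is a minimal Python sketch that pairs each "Kernel reported" error with the following "operational after recovery" line and flags any recovery that goes past the 8-attempt limit mentioned above. The function names, the regular expressions, and the year passed to the timestamp parser are my own assumptions for illustration (syslog lines carry no year); they are not part of open-iscsi or Oracle tooling.

```python
import re
from datetime import datetime

# Limit taken from the post: ASM drops the disks when recovery attempts exceed 8.
ASM_ATTEMPT_LIMIT = 8

TS = r"(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})"
ERROR_RE = re.compile(
    "^" + TS + r"\s+\S+\s+iscsid: Kernel reported iSCSI connection (?P<conn>\d+:\d+) error"
)
RECOVERY_RE = re.compile(
    "^" + TS + r"\s+\S+\s+iscsid: connection(?P<conn>\d+:\d+) is operational "
    r"after recovery \((?P<attempts>\d+) attempts?\)"
)

def parse_ts(text, year=2019):
    # Assumption: the year is supplied by the caller because syslog omits it.
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")

def summarize(lines, limit=ASM_ATTEMPT_LIMIT):
    """Pair each connection error with its recovery line, per connection id."""
    pending = {}   # connection id -> timestamp of the unresolved error
    outages = []
    for line in lines:
        m = ERROR_RE.match(line)
        if m:
            pending[m["conn"]] = parse_ts(m["ts"])
            continue
        m = RECOVERY_RE.match(line)
        if m and m["conn"] in pending:
            start = pending.pop(m["conn"])
            attempts = int(m["attempts"])
            outages.append({
                "conn": m["conn"],
                "start": start,
                "down_seconds": int((parse_ts(m["ts"]) - start).total_seconds()),
                "attempts": attempts,
                "over_limit": attempts > limit,
            })
    return outages

# Log lines copied from the post above.
SAMPLE = """
Sep 15 12:07:52 lin12corc01 iscsid: Kernel reported iSCSI connection 4:0 error (1022 - Invalid or unknown error code) state (3)
Sep 15 12:08:17 lin12corc01 iscsid: connection4:0 is operational after recovery (2 attempts)
Sep 15 12:28:34 lin12corc01 iscsid: Kernel reported iSCSI connection 4:0 error (1022 - Invalid or unknown error code) state (3)
Sep 15 12:29:22 lin12corc01 iscsid: connection4:0 is operational after recovery (4 attempts)
Sep 15 12:49:39 lin12corc01 iscsid: Kernel reported iSCSI connection 4:0 error (1022 - Invalid or unknown error code) state (3)
Sep 15 12:49:56 lin12corc01 iscsid: connection4:0 is operational after recovery (2 attempts)
Sep 15 13:10:10 lin12corc01 iscsid: Kernel reported iSCSI connection 4:0 error (1022 - Invalid or unknown error code) state (3)
Sep 15 13:10:27 lin12corc01 iscsid: connection4:0 is operational after recovery (2 attempts)
"""

if __name__ == "__main__":
    for o in summarize(SAMPLE.splitlines()):
        flag = "OVER LIMIT" if o["over_limit"] else "ok"
        print(o["start"].strftime("%H:%M:%S"), o["conn"],
              f'{o["down_seconds"]}s down,', o["attempts"], "attempts,", flag)
```

On the lines above, this reports four outages lasting between 17 and 48 seconds, with at most 4 recovery attempts, so none of them crosses the limit. To check a longer capture, the same function could be fed the lines of the system log file instead of SAMPLE.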

To isolate the issue, I have stopped and disabled the entire Oracle stack, so absolutely nothing is using the iSCSI disks. Only one Linux 7 VM is running and there is zero load on the system, yet these messages are still logged. Following various suggestions, I have played around with the noop timings in iscsid.conf, but no luck. Jumbo frames are not enabled anywhere. I don't know what else to check.

Any help will be greatly appreciated.

Thanks...!!
Arun
 

Arun Gupta

Yes. On the Linux VM, I logged out of the iSCSI session, deleted the iSCSI configuration using iscsiadm, and then deleted the iSCSI database by removing the iscsi directory under /var/lib. Then I created a brand-new session, discovered the target, and logged into it. All the disks were visible. Since then, iSCSI has been rock solid. No issues.
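For anyone who wants to follow the same sequence, here is a rough Python sketch that only builds and prints the commands in order (a dry run, nothing is executed). The target IQN and portal address in the example are hypothetical placeholders, and the database path /var/lib/iscsi is an assumption based on "the iscsi directory under /var/lib". The iscsiadm options used are the standard node-mode and discovery-mode ones, but check them against your distribution's man page before running anything, since the delete steps are destructive.

```python
import shlex

def reset_plan(target_iqn, portal, db_dir="/var/lib/iscsi"):
    """Return the reset steps as argument lists, in order. Nothing is executed."""
    node = ["iscsiadm", "-m", "node", "-T", target_iqn, "-p", portal]
    return [
        node + ["--logout"],            # log out of the existing session
        node + ["-o", "delete"],        # delete the stored node record
        ["rm", "-rf", db_dir],          # remove the initiator database (assumed path)
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal],
        node + ["--login"],             # create a fresh session
    ]

if __name__ == "__main__":
    # Hypothetical values; substitute your own target name and FreeNAS portal.
    for step in reset_plan("iqn.2005-10.org.freenas.ctl:example", "192.0.2.10:3260"):
        print(shlex.join(step))
```

Printing the plan first makes it easy to review each step before typing it in as root.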
 