Inconsistent number of disks at boot

union

Dabbler
Joined
Apr 10, 2019
Messages
13
Hello,

I am having an issue with my system where a different number of disks shows up on each boot. There are 23 disks attached, but every time I reboot a different number of them appear, which means I can't create the pool I want. On the off chance that all the disks attach after a reboot, I can create the pool, but on the next reboot the pool becomes degraded because not all of the disks have attached.

Code:
root@kessel:~ # dmesg | grep mps0
mps0: <Avago Technologies (LSI) SAS2008> port 0xc000-0xc0ff mem 0xfab3c000-0xfab3ffff,0xfab40000-0xfab7ffff irq 16 at device 0.0 on pci6
mps0: Firmware: 20.00.07.00, Driver: 21.02.00.00-fbsd
mps0: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
mps0: SAS Address for SATA device = 166a5b729a669d95
mps0: SAS Address for SATA device = 7754e729a669d95
mps0: SAS Address for SATA device = 1c695e729a669d95
mps0: SAS Address for SATA device = 18656e739a669db0
mps0: SAS Address for SATA device = 76d65729a669e95
mps0: SAS Address for SATA device = 1c7267729a669d95
mps0: SAS Address for SATA device = d6765729a669daf
mps0: SAS Address for SATA device = 7736e729a669d95
mps0: SAS Address for SATA device = 166d6d729a669db4
mps0: SAS Address for SATA device = 16566a729a669d95
mps0: SAS Address for SATA device = 265f5c729a669e9e
mps0: SAS Address for SATA device = 2c53486d9a669d93
mps0: SAS Address for SATA device = 46d596d9a669d94
mps0: SAS Address for SATA device = 77549729a669d95
mps0: SAS Address from SATA device = 166a5b729a669d95
mps0: SAS Address from SATA device = 7754e729a669d95
mps0: SAS Address from SATA device = 1c695e729a669d95
mps0: SAS Address from SATA device = 18656e739a669db0
mps0: SAS Address from SATA device = 76d65729a669e95
mps0: SAS Address from SATA device = 1c7267729a669d95
mps0: SAS Address from SATA device = d6765729a669daf
mps0: SAS Address from SATA device = 7736e729a669d95
mps0: SAS Address from SATA device = 166d6d729a669db4
mps0: SAS Address from SATA device = 16566a729a669d95
mps0: SAS Address from SATA device = 265f5c729a669e9e
mps0: SAS Address from SATA device = 2c53486d9a669d93
mps0: SAS Address from SATA device = 46d596d9a669d94
mps0: SAS Address from SATA device = 77549729a669d95
mps0: SAS Address for SATA device = 97548729a669db3
mps0: SAS Address for SATA device = 7704f729a669d95
mps0: SAS Address for SATA device = 21674b729a669e9f
mps0: SAS Address for SATA device = 1e744b729a669d95
mps0: SAS Address from SATA device = 97548729a669db3
mps0: SAS Address from SATA device = 7704f729a669d95
mps0: SAS Address from SATA device = 21674b729a669e9f
mps0: SAS Address from SATA device = 1e744b729a669d95
mps0: SAS Address for SATA device = 16736b729a669d95
mps0: SAS Address from SATA device = 16736b729a669d95
mps0: SAS Address for SATA device = 1c6d4a729a669d95
mps0: SAS Address from SATA device = 1c6d4a729a669d95
mps0: SAS Address for SATA device = 85562729a669e95
mps0: SAS Address from SATA device = 85562729a669e95
mps0: SAS Address for SATA device = d6ce7d05e1dcf0db
mps0: SAS Address from SATA device = d6ce7d05e1dcf0db
run_interrupt_driven_hooks: still waiting after 180 seconds for    (probe0:mps0:0:29:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 274 Aborting command 0xfffffe0000fbf7a0
 xpt_configmps0: Sending reset from mpssas_send_abort for target ID 29
    (probe22:mps0:0:28:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 273 Aborting command 0xfffffe0000fbf650
mps0: Sending reset from mpssas_send_abort for target ID 28
    (probe21:mps0:0:27:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 272 Aborting command 0xfffffe0000fbf500
mps0: Sending reset from mpssas_send_abort for target ID 27
    (probe20:mps0:0:12:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 271 Aborting command 0xfffffe0000fbf3b0
mps0: Sending reset from mpssas_send_abort for target ID 12
    (probe18:mps0:0:26:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 269 Aborting command 0xfffffe0000fbf110
mps0: Sending reset from mpssas_send_abort for target ID 26
    (probe17:mps0:0:23:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 268 Aborting command 0xfffffe0000fbefc0
mps0: Sending reset from mpssas_send_abort for target ID 23
    (probe16:mps0:0:22:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 267 Aborting command 0xfffffe0000fbee70
mps0: Sending reset from mpssas_send_abort for target ID 22
    (probe15:mps0:0:25:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 266 Aborting command 0xfffffe0000fbed20
mps0: Sending reset from mpssas_send_abort for target ID 25
    (probe14:mps0:0:24:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 265 Aborting command 0xfffffe0000fbebd0
mps0: Sending reset from mpssas_send_abort for target ID 24
    (probe13:mps0:0:20:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 264 Aborting command 0xfffffe0000fbea80
mps0: Sending reset from mpssas_send_abort for target ID 20
    (probe12:mps0:0:19:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 263 Aborting command 0xfffffe0000fbe930
mps0: Sending reset from mpssas_send_abort for target ID 19
    (probe11:mps0:0:21:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 262 Aborting command 0xfffffe0000fbe7e0
mps0: Sending reset from mpssas_send_abort for target ID 21
    (probe10:mps0:0:18:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 261 Aborting command 0xfffffe0000fbe690
mps0: Sending reset from mpssas_send_abort for target ID 18
    (probe9:mps0:0:17:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 260 Aborting command 0xfffffe0000fbe540
mps0: Sending reset from mpssas_send_abort for target ID 17
    (probe8:mps0:0:32:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 259 Aborting command 0xfffffe0000fbe3f0
mps0: Sending reset from mpssas_send_abort for target ID 32
mps0: Unfreezing devq for target ID 27
mps0: Unfreezing devq for target ID 12
mps0: Unfreezing devq for target ID 26
mps0: Unfreezing devq for target ID 23
mps0: Unfreezing devq for target ID 22
mps0: Unfreezing devq for target ID 25
mps0: Unfreezing devq for target ID 20
mps0: Unfreezing devq for target ID 21
    (probe7:mps0:0:16:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 258 Aborting command 0xfffffe0000fbe2a0
mps0: Sending reset from mpssas_send_abort for target ID 16
    (probe6:mps0:0:15:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 257 Aborting command 0xfffffe0000fbe150
mps0: Sending reset from mpssas_send_abort for target ID 15
    (probe5:mps0:0:13:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 256 Aborting command 0xfffffe0000fbe000
mps0: Sending reset from mpssas_send_abort for target ID 13
    (probe4:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 255 Aborting command 0xfffffe0000fbdeb0
mps0: Sending reset from mpssas_send_abort for target ID 14
    (probe3:mps0:0:11:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 254 Aborting command 0xfffffe0000fbdd60
mps0: Sending reset from mpssas_send_abort for target ID 11
    (probe2:mps0:0:10:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 253 Aborting command 0xfffffe0000fbdc10
mps0: Sending reset from mpssas_send_abort for target ID 10
    (probe1:mps0:0:9:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 252 Aborting command 0xfffffe0000fbdac0
mps0: Sending reset from mpssas_send_abort for target ID 9
(probe21:mps0:0:27:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe21:mps0:0:27:0): CAM status: Command timeout
mps0: (probe21:Unfreezing devq for target ID 19
mps0:0:27:0): Retrying command
mps0: Unfreezing devq for target ID 18
mps0: Unfreezing devq for target ID 32
mps0: Unfreezing devq for target ID 16
mps0: Unfreezing devq for target ID 15
mps0: Unfreezing devq for target ID 14
(probe20:mps0:0:12:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe20:mps0:0:12:0): CAM status: Command timeout
(probe20:mps0:0:12:0): Retrying command
(probe18:mps0:0:26:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe18:mps0:0:26:0): CAM status: Command timeout
(probe18:mps0:0:26:0): Retrying command
(probe17:mps0:0:23:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe17:mps0:0:23:0): CAM status: Command timeout
(probe17:mps0:0:23:0): Retrying command
(probe16:mps0:0:22:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe16:mps0:0:22:0): CAM status: Command timeout
(probe16:mps0:0:22:0): Retrying command
(probe15:mps0:0:25:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe15:mps0:0:25:0): CAM status: Command timeout
(probe15:mps0:0:25:0): Retrying command
(probe13:mps0:0:20:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe13:mps0:0:20:0): CAM status: Command timeout
(probe13:mps0:0:20:0): Retrying command
(probe11:mps0:0:21:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe11:mps0:0:21:0): CAM status: Command timeout
(probe11:mps0:0:21:0): Retrying command
(probe12:mps0:0:19:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe12:mps0:0:19:0): CAM status: Command timeout
mps0: (probe12:mps0:0:Unfreezing devq for target ID 11
mps0: 19:0): Retrying command
mps0: Unfreezing devq for target ID 9
(probe10:mps0:0:18:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe10:mps0:0:18:0): CAM status: Command timeout
(probe10:mps0:0:18:0): Retrying command
(probe8:mps0:0:32:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe8:mps0:0:32:0): CAM status: Command timeout
(probe8:mps0:0:32:0): Retrying command
(probe7:mps0:0:16:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe7:mps0:0:16:0): CAM status: Command timeout
(probe7:mps0:0:16:0): Retrying command
(probe6:mps0:0:15:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe6:mps0:0:15:0): CAM status: Command timeout
(probe6:mps0:0:15:0): Retrying command
(probe4:mps0:0:14:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe4:mps0:0:14:0): CAM status: Command timeout
(probe4:mps0:0:14:0): Retrying command
(probe3:mps0:0:11:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe3:mps0:0:11:0): CAM status: Command timeout
(probe3:mps0:0:11:0): Retrying command
(probe2:mps0:0:10:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe2:mps0:0:10:0): CAM status: Command timeout
(probe2:mps0:0:10:0): Retrying command
(probe1:mps0:0:9:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe1:mps0:0:9:0): CAM status: Command timeout
(probe1:mps0:0:9:0): Retrying command
mps0: Unfreezing devq for target ID 24
(probe14:mps0:0:24:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe14:mps0:0:24:0): CAM status: Command timeout
(probe14:mps0:0:24:0): Retrying command
    (probe14:mps0:0:24:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 304 terminated ioc 804b loginfo 31170000 scsi 0 state c xfer 0
(probe14:mps0:0:24:0): INQUIRY. CDB: 12 00 00 00 24 00
mps0: (probe14:mps0:0:24:0): CAM status: CCB request completed with an error
(probe14:mps0:0:24:0): Retrying command
mps0: mpssas_prepare_remove: Sending reset for target ID 13
mps0: mpssas_prepare_remove: Sending reset for target ID 17
mps0: mpssas_prepare_remove: Sending reset for target ID 28
(probe5:mps0:0:13:0): INQUIRY. CDB: 12 00 00 00 24 00
mps0: (probe5:mps0:0:13:0): CAM status: Command timeout
(probe5:mps0:0:13:0): Retrying command
mps0: Unfreezing devq for target ID 13
mps0: SAS Address for SATA device = 76d65729a669e95
mps0: SAS Address from SATA device = 76d65729a669e95
    (probe14:mps0:0:24:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 313 terminated ioc 804b loginfo 31170000 scsi 0 state 0 xfer 0
(probe14:mps0:0:24:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe14:mps0:0:24:0): CAM status: CCB request completed with an error
(probe14:mps0:0:24:0): Retrying command
(probe0:mps0:0:29:0): INQUIRY. CDB: 12 00 00 00 24 00
mps0: (probe0:mps0:0:29:0): CAM status: Command timeout
mps0: (probe0:Unfreezing devq for target ID 28
mps0: Unfreezing devq for target ID 29
mps0: mps0:0:Unfreezing devq for target ID 28
(probe22:mps0:0:28:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe22:mps0:0:28:0): CAM status: Command timeout
(probe22:mps0:0:28:0): Retrying command
(probe9:mps0:0:17:0): INQUIRY. CDB: 12 00 00 00 24 00
mps0: (probe9:mps0:0:17:0): CAM status: Command timeout
(probe9:mps0: mps0:0:Unfreezing devq for target ID 17
mps0: SAS Address for SATA device = 166d6d729a669db4
mps0: SAS Address from SATA device = 166d6d729a669db4
mps0: Sleeping 3 seconds after SATA ID error to wait for spinup
mps0: mpssas_ata_id_timeout checking ATA ID command 0xfffffe0000fc5a10 sc 0xfffffe0000f7d000
mps0: ATA ID command timeout cm 0xfffffe0000fc5a10
mpssas_get_sata_identify: request for page completed with error 0mps0: Sleeping 3 seconds after SATA ID error to wait for spinup
mps0: mpssas_ata_id_timeout checking ATA ID command 0xfffffe0000fc65e0 sc 0xfffffe0000f7d000
mps0: ATA ID command timeout cm 0xfffffe0000fc65e0
mpssas_get_sata_identify: request for page completed with error 0mps0: Sleeping 3 seconds after SATA ID error to wait for spinup
mps0: mpssas_ata_id_timeout checking ATA ID command 0xfffffe0000fc6730 sc 0xfffffe0000f7d000
mps0: ATA ID command timeout cm 0xfffffe0000fc6730
mpssas_get_sata_identify: request for page completed with error 0mps0: Sleeping 3 seconds after SATA ID error to wait for spinup
mps0: mpssas_ata_id_timeout checking ATA ID command 0xfffffe0000fc6880 sc 0xfffffe0000f7d000
mps0: ATA ID command timeout cm 0xfffffe0000fc6880
mpssas_get_sata_identify: request for page completed with error 0mps0: Sleeping 3 seconds after SATA ID error to wait for spinup
mps0: mpssas_add_device: failed to get disk type (SSD or HDD) for SATA device with handle 0x001f
mps0: mpssas_add_device: sending Target Reset for stuck SATA identify command (cm = 0xfffffe0000fc5a10)
    (noperiph:mps0:0:31:0): SMID 27 sending target reset
mps0: mpssas_action_scsiio: Freezing devq for target ID 31
    (xpt0:mps0:0:31:ffffffff): SMID 27 recovery finished after target reset
(probe9:mps0:0:31:0): INQUIRY. CDB: 12 00 00 00 24 00


The above is the output I get in dmesg. It is a truncated version, but the rest is much the same, over and over.
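To see at a glance which targets are affected, the timeouts in a log like this can be tallied per target ID. The following is only a minimal sketch: the function name `timed_out_targets` is made up for illustration, and it assumes the `(probeN:mps0:bus:target:lun): CAM status: Command timeout` line format shown above. Lines that the kernel interleaved with other messages may be missed.

```shell
# Hypothetical helper: list each target ID that logged a CAM command timeout,
# with the number of timeouts seen. Reads dmesg text on stdin.
timed_out_targets() {
    # Keep only timeout lines, then extract the target field from
    # "(probe21:mps0:0:27:0)" style prefixes (bus:target:lun after mps0).
    grep 'CAM status: Command timeout' |
        sed -n 's/.*(probe[0-9]*:mps0:[0-9]*:\([0-9]*\):[0-9]*).*/\1/p' |
        sort -n | uniq -c |
        awk '{ print "target " $2 ": " $1 " timeout(s)" }'
}

# Usage on the affected system:
#   dmesg | timed_out_targets
```

Comparing this list across several reboots would show whether the same targets fail every time or whether the failures move around.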

If anybody could be of assistance that would be appreciated.

Thank you,
Matt
 

pschatz100

Guru
Joined
Mar 30, 2014
Messages
1,184
It could be that FreeNAS is starting before all the disks are ready. Assuming that you do not have an issue with your power supplies, I would check your motherboard BIOS settings and enable staggered spin-up in the IDE/SATA settings.
 

union

Dabbler
Joined
Apr 10, 2019
Messages
13
@pschatz100 Thank you for the reply. I replaced the card and flashed it with the latest firmware and BIOS, because I wasn't using the BIOS before. Now the pool seems to come up okay; there now seems to be enough time to initialise the disks before the OS starts.

Now that I have created my pool, my cache disk shows as unavailable after a reboot, apparently because it has a different PID. When I run
Code:
zpool status
I get this:
Code:
root@kessel:/dev/gptid # zpool status
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:08:29 with 0 errors on Tue Oct 15 03:53:30 2019
config:

    NAME        STATE     READ WRITE CKSUM
    freenas-boot  ONLINE       0     0     0
      da22p2    ONLINE       0     0     0

errors: No known data errors

  pool: pool1
 state: ONLINE
status: One or more devices could not be opened.  Sufficient replicas exist for
    the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: none requested
config:

    NAME                                            STATE     READ WRITE CKSUM
    pool1                                           ONLINE       0     0     0
      raidz2-0                                      ONLINE       0     0     0
        gptid/ec027c0f-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/ecc946ac-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/ed938228-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/ee65b7cf-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/ef395342-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/f00e4434-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/f0eb5544-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/f1c61729-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/f2b03894-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/f3862087-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/f45c214b-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
      raidz2-1                                      ONLINE       0     0     0
        gptid/f54632d3-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/f61f24c1-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/f6f389db-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/f7cccb5a-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/f8a0c031-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/f9919d63-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/fa6f8f60-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/fb4e0114-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/fc45512e-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/fd240ba3-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
        gptid/fe1b91f3-ee8e-11e9-9142-003048f3b32e  ONLINE       0     0     0
    cache
      2586918424065005860                           UNAVAIL      0     0     0  was /dev/gptid/ff9af361-ee8e-11e9-9142-003048f3b32e

errors: No known data errors


Do you know why this might be? Or how to consistently attach it to the pool?
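For checking this after each reboot, the unavailable devices can be pulled out of the `zpool status` output with a small filter. This is a rough sketch only: the function name `unavail_devices` is invented here, and it assumes the column layout shown above, with the state in the second column and an optional trailing "was <path>" note.

```shell
# Hypothetical helper: print every device that zpool status reports as
# UNAVAIL, plus the path it was last seen at when a "was ..." note exists.
# Reads zpool status text on stdin.
unavail_devices() {
    awk '$2 == "UNAVAIL" {
        out = $1
        # Look for the "was <path>" note that follows the error counters.
        for (i = 3; i < NF; i++)
            if ($i == "was") out = out " (was " $(i + 1) ")"
        print out
    }'
}

# Usage on the affected system:
#   zpool status pool1 | unavail_devices
```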

Thank you,
Matt
 

pschatz100

Guru
Joined
Mar 30, 2014
Messages
1,184
I don't have experience with this myself. However, when other people have asked for guidance, they have received the following response: the answer is in the FreeNAS manual; use the Volume Manager. Select your existing pool under "volume to extend", select your device, and set the drop-down to "cache device".
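For reference, the Volume Manager steps correspond roughly to `zpool remove` followed by `zpool add ... cache` on the command line. The sketch below only prints the commands rather than running them. The helper name `print_cache_readd` and the `gptid/NEW-CACHE-PARTITION` argument are placeholders, not values from this system, and on FreeNAS the GUI remains the safer route because it keeps the configuration database in step with the pool.

```shell
# Hypothetical dry-run helper: print (do not run) the zpool commands that
# would drop a stale cache device and add a replacement.
# Arguments: pool name, GUID or name of the stale cache device, new device.
print_cache_readd() {
    pool=$1
    stale=$2
    new=$3
    echo "zpool remove $pool $stale"
    echo "zpool add $pool cache $new"
}

# Pool name and GUID are taken from the zpool status output in this thread;
# the new device name is a placeholder to be replaced with the real partition.
print_cache_readd pool1 2586918424065005860 gptid/NEW-CACHE-PARTITION
```

Reviewing the printed commands before typing them by hand makes it harder to remove the wrong device.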
 