Parity check found errors



Hi everyone,

Sorry if there is another similar post but I couldn't find it with search.

I have been using unRAID for a year and have been very satisfied, with no problems until today. I have a parity check scheduled for the 1st of every month, and it never reported any errors until now. Today's parity check gave me 593 errors and I don't know what to do.

My data seems to be OK and all the services are working fine.

Does anyone have any advice on what to do?

I haven't installed the latest update, v6.3.3, and am still running v6.3.2, because updating would require restarting the system.

lanstorage-diagnostics-20170401-1653.zip

Link to comment

Looks to me like the errors were caused by the SAS2LP timing out; you're not the first. Notice that the sync errors started exactly 60 seconds before the SAS2LP error:

Apr  1 03:46:32 lanstorage kernel: md: recovery thread: P corrected, sector=1652181888
Apr  1 03:46:32 lanstorage kernel: md: recovery thread: P corrected, sector=1652181896
Apr  1 03:46:32 lanstorage kernel: md: recovery thread: P corrected, sector=1652181904
Apr  1 03:46:32 lanstorage kernel: md: recovery thread: stopped logging
Apr  1 03:47:32 lanstorage kernel: sas: Enter sas_scsi_recover_host busy: 3 failed: 3
Apr  1 03:47:32 lanstorage kernel: sas: trying to find task 0xffff88012ba8c900
Apr  1 03:47:32 lanstorage kernel: sas: sas_scsi_find_task: aborting task 0xffff88012ba8c900
Apr  1 03:47:32 lanstorage kernel: sas: sas_scsi_find_task: task 0xffff88012ba8c900 is aborted
Apr  1 03:47:32 lanstorage kernel: sas: sas_eh_handle_sas_errors: task 0xffff88012ba8c900 is aborted
Apr  1 03:47:32 lanstorage kernel: sas: trying to find task 0xffff88016fda8400
Apr  1 03:47:32 lanstorage kernel: sas: sas_scsi_find_task: aborting task 0xffff88016fda8400
Apr  1 03:47:32 lanstorage kernel: sas: sas_scsi_find_task: task 0xffff88016fda8400 is aborted
Apr  1 03:47:32 lanstorage kernel: sas: sas_eh_handle_sas_errors: task 0xffff88016fda8400 is aborted
Apr  1 03:47:32 lanstorage kernel: sas: trying to find task 0xffff88015edccd00
Apr  1 03:47:32 lanstorage kernel: sas: sas_scsi_find_task: aborting task 0xffff88015edccd00
Apr  1 03:47:32 lanstorage kernel: sas: sas_scsi_find_task: task 0xffff88015edccd00 is aborted
Apr  1 03:47:32 lanstorage kernel: sas: sas_eh_handle_sas_errors: task 0xffff88015edccd00 is aborted
Apr  1 03:47:32 lanstorage kernel: sas: ata15: end_device-2:6: cmd error handler
Apr  1 03:47:32 lanstorage kernel: sas: ata13: end_device-2:4: cmd error handler
Apr  1 03:47:32 lanstorage kernel: sas: ata9: end_device-2:0: dev error handler
Apr  1 03:47:32 lanstorage kernel: sas: ata10: end_device-2:1: dev error handler
Apr  1 03:47:32 lanstorage kernel: sas: ata11: end_device-2:2: dev error handler
Apr  1 03:47:32 lanstorage kernel: sas: ata12: end_device-2:3: dev error handler
Apr  1 03:47:32 lanstorage kernel: sas: ata13: end_device-2:4: dev error handler
Apr  1 03:47:32 lanstorage kernel: ata13.00: exception Emask 0x0 SAct 0x28000000 SErr 0x0 action 0x6 frozen
Apr  1 03:47:32 lanstorage kernel: ata13.00: failed command: READ FPDMA QUEUED
Apr  1 03:47:32 lanstorage kernel: ata13.00: cmd 60/00:00:40:4f:7a/04:00:62:00:00/40 tag 27 ncq dma 524288 in
Apr  1 03:47:32 lanstorage kernel:         res 40/00:bc:90:2f:bf/00:00:3c:00:00/40 Emask 0x4 (timeout)
Apr  1 03:47:32 lanstorage kernel: ata13.00: status: { DRDY }
Apr  1 03:47:32 lanstorage kernel: ata13.00: failed command: READ FPDMA QUEUED
Apr  1 03:47:32 lanstorage kernel: ata13.00: cmd 60/10:00:00:d0:cf/00:00:ae:00:00/40 tag 29 ncq dma 8192 in
Apr  1 03:47:32 lanstorage kernel:         res 40/00:cc:98:10:e0/00:00:30:00:00/40 Emask 0x4 (timeout)
Apr  1 03:47:32 lanstorage kernel: ata13.00: status: { DRDY }
Apr  1 03:47:32 lanstorage kernel: ata13: hard resetting link
Apr  1 03:47:32 lanstorage kernel: sas: ata14: end_device-2:5: dev error handler
Apr  1 03:47:32 lanstorage kernel: sas: ata15: end_device-2:6: dev error handler
Apr  1 03:47:32 lanstorage kernel: ata15.00: exception Emask 0x0 SAct 0x100 SErr 0x0 action 0x6 frozen
Apr  1 03:47:32 lanstorage kernel: ata15.00: failed command: READ FPDMA QUEUED
Apr  1 03:47:32 lanstorage kernel: ata15.00: cmd 60/00:00:a8:40:7a/04:00:62:00:00/40 tag 8 ncq dma 524288 in
Apr  1 03:47:32 lanstorage kernel:         res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  1 03:47:32 lanstorage kernel: ata15.00: status: { DRDY }
Apr  1 03:47:32 lanstorage kernel: ata15: hard resetting link
Apr  1 03:47:32 lanstorage kernel: sas: ata16: end_device-2:7: dev error handler
Apr  1 03:47:32 lanstorage kernel: sas: sas_form_port: phy2 belongs to port6 already(1)!
Apr  1 03:47:32 lanstorage kernel: sas: sas_form_port: phy0 belongs to port4 already(1)!
Apr  1 03:47:34 lanstorage kernel: drivers/scsi/mvsas/mv_sas.c 1435:mvs_I_T_nexus_reset for device[0]:rc= 0
Apr  1 03:47:34 lanstorage kernel: drivers/scsi/mvsas/mv_sas.c 1435:mvs_I_T_nexus_reset for device[2]:rc= 0
Apr  1 03:47:34 lanstorage kernel: ata13.00: configured for UDMA/133
Apr  1 03:47:34 lanstorage kernel: ata13: EH complete
Apr  1 03:47:35 lanstorage kernel: ata15.00: configured for UDMA/133
Apr  1 03:47:35 lanstorage kernel: ata15: EH complete
Apr  1 03:47:35 lanstorage kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 3 tries: 1
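For anyone who wants to check their own syslog for the same pattern, here is a minimal Python sketch. It assumes syslog lines in the format shown above; the function names and the `year` parameter are made up for illustration (syslog timestamps carry no year), and the sample lines are copied from the excerpt above.

```python
import re
from datetime import datetime

# Matches lines like:
# "Apr  1 03:46:32 lanstorage kernel: md: recovery thread: P corrected, sector=..."
LINE_RE = re.compile(r"^(\w{3})\s+(\d{1,2}) (\d{2}:\d{2}:\d{2}) \S+ kernel: (.*)$")


def parse_line(line, year=2017):
    """Return (timestamp, message) for a kernel syslog line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    month, day, clock, message = m.groups()
    stamp = datetime.strptime(
        f"{year} {month} {int(day):02d} {clock}", "%Y %b %d %H:%M:%S"
    )
    return stamp, message


def gap_before_sas_recovery(lines, year=2017):
    """Seconds between the first 'P corrected' line and the first SAS recovery line."""
    first_corrected = None
    first_recovery = None
    for line in lines:
        parsed = parse_line(line, year)
        if parsed is None:
            continue
        stamp, message = parsed
        if first_corrected is None and "P corrected" in message:
            first_corrected = stamp
        if first_recovery is None and message.startswith(
            "sas: Enter sas_scsi_recover_host"
        ):
            first_recovery = stamp
    if first_corrected is None or first_recovery is None:
        return None
    return (first_recovery - first_corrected).total_seconds()


sample = [
    "Apr  1 03:46:32 lanstorage kernel: md: recovery thread: P corrected, sector=1652181888",
    "Apr  1 03:46:32 lanstorage kernel: md: recovery thread: stopped logging",
    "Apr  1 03:47:32 lanstorage kernel: sas: Enter sas_scsi_recover_host busy: 3 failed: 3",
]
print(gap_before_sas_recovery(sample))  # 60.0
```

A gap close to 60 seconds would be consistent with the timeout pattern described above, but this only measures the interval; it does not prove the cause.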

There were also issues during the previous parity check, but they didn't cause any sync errors:

Mar  1 05:34:38 lanstorage kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
Mar  1 05:34:38 lanstorage kernel: sas: trying to find task 0xffff8800cddf9600
Mar  1 05:34:38 lanstorage kernel: sas: sas_scsi_find_task: aborting task 0xffff8800cddf9600
Mar  1 05:34:38 lanstorage kernel: sas: sas_scsi_find_task: task 0xffff8800cddf9600 is aborted
Mar  1 05:34:38 lanstorage kernel: sas: sas_eh_handle_sas_errors: task 0xffff8800cddf9600 is aborted
Mar  1 05:34:38 lanstorage kernel: sas: ata9: end_device-2:0: cmd error handler
Mar  1 05:34:38 lanstorage kernel: sas: ata9: end_device-2:0: dev error handler
Mar  1 05:34:38 lanstorage kernel: ata9.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar  1 05:34:38 lanstorage kernel: ata9.00: failed command: READ DMA EXT
Mar  1 05:34:38 lanstorage kernel: ata9.00: cmd 25/00:00:78:87:c9/00:04:98:00:00/e0 tag 29 dma 524288 in
Mar  1 05:34:38 lanstorage kernel:         res 40/00:00:1f:ef:a8/00:00:0d:00:00/ed Emask 0x4 (timeout)
Mar  1 05:34:38 lanstorage kernel: ata9.00: status: { DRDY }
Mar  1 05:34:38 lanstorage kernel: ata9: hard resetting link
Mar  1 05:34:38 lanstorage kernel: sas: ata10: end_device-2:1: dev error handler
Mar  1 05:34:38 lanstorage kernel: sas: ata11: end_device-2:2: dev error handler
Mar  1 05:34:38 lanstorage kernel: sas: ata12: end_device-2:3: dev error handler
Mar  1 05:34:38 lanstorage kernel: sas: ata13: end_device-2:4: dev error handler
Mar  1 05:34:38 lanstorage kernel: sas: ata14: end_device-2:5: dev error handler
Mar  1 05:34:38 lanstorage kernel: sas: ata15: end_device-2:6: dev error handler
Mar  1 05:34:38 lanstorage kernel: sas: ata16: end_device-2:7: dev error handler
Mar  1 05:34:38 lanstorage kernel: sas: sas_form_port: phy0 belongs to port0 already(1)!
Mar  1 05:34:40 lanstorage kernel: drivers/scsi/mvsas/mv_sas.c 1435:mvs_I_T_nexus_reset for device[0]:rc= 0
Mar  1 05:34:41 lanstorage kernel: ata9.00: configured for UDMA/133
Mar  1 05:34:41 lanstorage kernel: ata9: EH complete
Mar  1 05:34:41 lanstorage kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
Mar  1 12:01:21 lanstorage kernel: md: sync done. time=41479sec
Mar  1 12:01:21 lanstorage kernel: md: recovery thread: completion status: 0
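As a side note, the `time=41479sec` value in the "sync done" line is the check duration in seconds. A small Python sketch for converting it to hours, minutes and seconds, assuming the line format shown above (the helper name is hypothetical):

```python
import re
from datetime import timedelta

# Pattern for the md driver's completion line as it appears in the log above.
SYNC_DONE_RE = re.compile(r"md: sync done\. time=(\d+)sec")


def sync_duration(line):
    """Return the parity check duration as a timedelta, or None if the line does not match."""
    m = SYNC_DONE_RE.search(line)
    if m is None:
        return None
    return timedelta(seconds=int(m.group(1)))


line = "Mar  1 12:01:21 lanstorage kernel: md: sync done. time=41479sec"
print(sync_duration(line))  # 11:31:19
```

So the March check ran for about eleven and a half hours, and the line after it reports completion status 0.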

 

Look for a BIOS update. If you have VT-d enabled and don't need it, disable it. Try a different PCIe slot if one is available and run another check; if the errors persist, consider replacing the controller.

Link to comment

Thanks for the quick reply. I think that my CPU, a G620, doesn't support VT-d, only VT-x. I will check it out.

Now that you mention it, I think the controller is the most likely cause rather than the BIOS, because the system ran for a year without any problems and I have not changed any settings.

Could the power supply be to blame?

Link to comment
1 hour ago, johnnie.black said:

Look for a BIOS update. If you have VT-d enabled and don't need it, disable it. Try a different PCIe slot if one is available and run another check; if the errors persist, consider replacing the controller.

You are right. There was a BIOS update and I have installed it. The release notes say it was to "Enhance compatibility with some PCIE devices".

I will run the parity check again and we will see.

Thanks again.

Link to comment
  • 7 months later...
On 4/1/2017 at 8:16 PM, johnnie.black said:

Hope it helps, and although only a few SASLP/SAS2LP users have those issues, it's a fairly common problem.

You were right. The problem was the controller. I replaced it with a Dell PERC H310 (0HV52W), flashed it to IT mode, and the parity check errors disappeared.

Thanks again.

Link to comment
