Yet More Drive Issues - Any ideas?

Disk 2. Twice this week this drive has been disabled.

 

Both diagnostics are attached. The drive passes an extended SMART test, and I can't see any errors in the log apart from:

 

Jan 22 00:02:35 Tower kernel: sd 6:0:3:0: [sdl] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00

 

The first time it showed an irrationally high number of "Reads" as can be made out in the screenshot below (sorry about the red filter).

 

[Screenshot: disk read counts]

 

The first time, whilst it was disabled, I copied the emulated data to another drive whilst running an extended SMART test. Once the SMART test passed, I rebuilt to the same drive (note there was little to rebuild, as I had moved the data off).

 

Two evenings later the same disk went down, albeit without the ridiculous number of reads.

 

The drive is attached to an LSI 8-port card. 3 other drives are on the same 4 SATA to SAS cable. It has been like this for 6-8 weeks without any issues (on these drives/controller at least).

 

Expert Opinion please?

tower-diagnostics-20210122-1019.zip tower-diagnostics-20210120-0351.zip

  • Community Expert

The drive dropped offline both times, and because of that there's no SMART report. Assuming the SMART report looks good, it's likely a connection problem. Try swapping cables with another drive, on the same or on a different controller.

  • Author

@JorgeB Thanks for the reply, as always.

 

I haven't yet done what you suggested as the box is hard to get to and I have to empty a cupboard before even attempting to open it up. In the meantime I rebuilt to the same disk.

 

Alas it disabled itself again the next night!

 

However, what I've noticed is that all 3 of the error states have occurred at almost the same time...

 

20 JAN 01:17 - Tower: Alert [TOWER] - Disk 2 in error state (disk dsbl) SAMSUNG_HD103SJ_S2C8J9GZA00505 (sdl)

22 JAN 01:20 - Tower: Alert [TOWER] - Disk 2 in error state (disk dsbl) SAMSUNG_HD103SJ_S2C8J9GZA00505 (sdl)

24 JAN 01:22 - Tower: Alert [TOWER] - Disk 2 in error state (disk dsbl) SAMSUNG_HD103SJ_S2C8J9GZA00505 (sdl)

 

This can't be a coincidence. 
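
To put a number on "almost the same time", here is a minimal Python sketch that measures the gap between the three alerts above. The year 2021 is an assumption taken from the diagnostics file names, and `parse_alert` and `gaps` are illustrative helper names, not part of Unraid. It only quantifies the spacing; it says nothing about the cause.

```python
from datetime import datetime

# Alert times copied from the three notifications above.
# The year is not in the alerts; 2021 is assumed from the diagnostics file names.
ALERTS = ["20 JAN 01:17", "22 JAN 01:20", "24 JAN 01:22"]


def parse_alert(text, year=2021):
    """Parse 'DD MON HH:MM' into a datetime, using the supplied year."""
    return datetime.strptime(f"{year} {text.title()}", "%Y %d %b %H:%M")


def gaps(alerts):
    """Return the time between consecutive alerts as timedelta objects."""
    times = [parse_alert(a) for a in alerts]
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for gap in gaps(ALERTS):
        print(gap)
```

The gaps come out at 2 days 3 minutes and 2 days 2 minutes. Each failure followed a rebuild, so the two-day spacing may partly reflect when the rebuilds were done, but the time of day barely moves.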

 

Log attached from the latest error. Any clues as to what the server is doing at this time every night?

 

Shouldn't be Mover, CA AppData Backup or SSD Trim. 

tower-diagnostics-20210125-1422.zip

  • Community Expert
6 minutes ago, air_marshall said:

what the server is doing at this time every night?

 

What do you get from the command line with this?

crontab -l

 

  • Author

Here is the Cron output

 

Linux 4.19.107-Unraid.
root@Tower:~# crontab -l
# If you don't want the output of a cron job mailed to you, you have to direct
# any output to /dev/null.  We'll do this here since these jobs should run
# properly on a newly installed system.  If a script fails, run-parts will
# mail a notice to root.
#
# Run the hourly, daily, weekly, and monthly cron jobs.
# Jobs that need different timing may be entered into the crontab as before,
# but most really don't need greater granularity than this.  If the exact
# times of the hourly, daily, weekly, and monthly cron jobs do not suit your
# needs, feel free to adjust them.
#
# Run hourly cron jobs at 47 minutes after the hour:
47 * * * * /usr/bin/run-parts /etc/cron.hourly 1> /dev/null
#
# Run daily cron jobs at 4:40 every day:
40 4 * * * /usr/bin/run-parts /etc/cron.daily 1> /dev/null
#
# Run weekly cron jobs at 4:30 on the first day of the week:
30 4 * * 0 /usr/bin/run-parts /etc/cron.weekly 1> /dev/null
#
# Run monthly cron jobs at 4:20 on the first day of the month:
20 4 1 * * /usr/bin/run-parts /etc/cron.monthly 1> /dev/null
0 1 * * 2 /usr/local/emhttp/plugins/ca.backup2/scripts/backup.php &>/dev/null 2>&1
root@Tower:~#

 

Any clues?

  • Community Expert
8 hours ago, trurl said:

 

What do you get from the command line with this?


crontab -l

 

Might you not also want the contents of the file /etc/cron.d/root  to see if that is running anything at those times?

  • Author
On 1/25/2021 at 10:45 PM, itimpi said:

Might you not also want the contents of the file /etc/cron.d/root  to see if that is running anything at those times?

 

# Generated docker monitoring schedule:
10 0 * * 1 /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/dockerupdate.php check &> /dev/null

# Generated system monitoring schedule:
*/1 * * * * /usr/local/emhttp/plugins/dynamix/scripts/monitor &> /dev/null

# Generated mover schedule:
40 3 * * 5 /usr/local/sbin/mover &> /dev/null

# Generated parity check schedule:
0 0 1 * * /usr/local/sbin/mdcmd check NOCORRECT &> /dev/null

# Generated plugins version check schedule:
10 0 * * 1 /usr/local/emhttp/plugins/dynamix.plugin.manager/scripts/plugincheck &> /dev/null

# Generated Unraid OS update check schedule:
11 0 * * 1 /usr/local/emhttp/plugins/dynamix.plugin.manager/scripts/unraidcheck &> /dev/null

# Generated cron settings for docker autoupdates
0 0 * * 0 /usr/local/emhttp/plugins/ca.update.applications/scripts/updateDocker.php >/dev/null 2>&1
# Generated cron settings for plugin autoupdates
0 0 * * * /usr/local/emhttp/plugins/ca.update.applications/scripts/updateApplications.php >/dev/null 2>&1

# CRON for CA background scanning of applications
0 * * * * php /usr/local/emhttp/plugins/community.applications/scripts/notices.php > /dev/null 2>&1

# Generated ssd trim schedule:
0 2 * * 1 /sbin/fstrim -a -v | logger &> /dev/null

# Generated system data collection schedule:
*/1 * * * * /usr/local/emhttp/plugins/dynamix.system.stats/scripts/sa1 1 1 &> /dev/null

 

Any clues?

  • Community Expert

Looks like the only thing scheduled to start around then is the CA Backup plug-in, at 1:00 am. That should not lead to your problem, though, unless it is triggering something non-obvious.

  • Community Expert
1 minute ago, itimpi said:

CA Backup plug-in scheduled to start at 1:00 am

That runs on Tuesdays; his syslog timestamps are on a Wednesday.

 

I always just go to corntab.com instead of trying to remember how to parse these.
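
For anyone who would rather check the schedules locally, the following is a minimal Python sketch of a five-field cron matcher. It is an illustration under stated assumptions, not how cron itself is implemented: `cron_matches` and `_field_matches` are hypothetical names, and only numeric fields with `*`, ranges, comma lists and `/step` are handled (no month or weekday names, no `@daily` shortcuts).

```python
from datetime import datetime


def _field_matches(field, value, lo, hi):
    """True if value is selected by one cron field (*, a-b, a,b and /step)."""
    for part in field.split(","):
        base, _, step = part.partition("/")
        step = int(step) if step else 1
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            start, end = (int(x) for x in base.split("-"))
        else:
            start = end = int(base)
        if value in range(start, end + 1, step):
            return True
    return False


def cron_matches(expr, when):
    """True if the five-field cron expression fires in the minute `when`."""
    minute, hour, dom, month, dow = expr.split()
    cron_dow = (when.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    dom_ok = _field_matches(dom, when.day, 1, 31)
    # Cron accepts both 0 and 7 for Sunday.
    dow_ok = _field_matches(dow, cron_dow, 0, 7) or (
        cron_dow == 0 and _field_matches(dow, 7, 0, 7))
    # Classic cron rule: if both day fields are restricted, either may match.
    if dom != "*" and dow != "*":
        day_ok = dom_ok or dow_ok
    else:
        day_ok = dom_ok and dow_ok
    return (_field_matches(minute, when.minute, 0, 59)
            and _field_matches(hour, when.hour, 0, 23)
            and _field_matches(month, when.month, 1, 12)
            and day_ok)


if __name__ == "__main__":
    backup = "0 1 * * 2"  # CA Backup line from the crontab output above
    print(cron_matches(backup, datetime(2021, 1, 19, 1, 0)))  # a Tuesday
    print(cron_matches(backup, datetime(2021, 1, 20, 1, 0)))  # a Wednesday
```

The first call checks a Tuesday at 01:00 and the second a Wednesday at 01:00, the day of the first alert. The same function can be pointed at any line from /etc/cron.d/root to see whether it fires in a given minute.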

 

  • 3 weeks later...
  • Author

Woes continue.

 

Firstly I re-manufactured the SATA power cables to this bank of drives as I didn't like them. I connected it back up, did a short SMART test, and rebuilt to the same drive. That night the same drive was disabled at 01:40 am.

 

Then I switched the SAS port it was on and rebuilt to the same disk. That night the same drive was disabled at 01:42 am.

 

WTF is going on here? I'm happy to accept the drive might be bad despite passing SMART tests, but with it disabling itself at such consistent times I don't believe it's just a coincidence....

 

Latest diagnostics attached, but I doubt they tell us anything new.

 

Shall I just give up and remove the drive?

tower-diagnostics-20210212-2340.zip

  • Community Expert

Nothing is assigned as disk1, is that expected?

 

Disk2 is disabled and doesn't appear to be connected since there is no SMART report for it.

  • Community Expert
8 hours ago, air_marshall said:

but disabling itself at such consistent times I don't believe is just a co-incidence....

You may have a ghost in the machine...

 

There are signs of the problem earlier:

 

Feb 12 00:01:28 Tower kernel: sd 10:0:2:0: attempting task abort! scmd(0000000003923bdb)
Feb 12 00:01:28 Tower kernel: sd 10:0:2:0: [sdk] tag#6275 CDB: opcode=0x85 85 09 0e 00 00 00 02 00 07 00 00 00 00 00 2f 00
Feb 12 00:01:28 Tower kernel: scsi target10:0:2: handle(0x000b), sas_address(0x4433221104000000), phy(4)
Feb 12 00:01:28 Tower kernel: scsi target10:0:2: enclosure logical id(0x500605b00991da10), slot(7)
Feb 12 00:01:32 Tower kernel: sd 10:0:2:0: task abort: SUCCESS scmd(0000000003923bdb)
Feb 12 00:02:08 Tower kernel: sd 10:0:2:0: device_block, handle(0x000b)
Feb 12 00:02:10 Tower kernel: sd 10:0:2:0: device_unblock and setting to running, handle(0x000b)
Feb 12 00:02:10 Tower kernel: sd 10:0:2:0: [sdk] Synchronizing SCSI cache
Feb 12 00:02:10 Tower kernel: sd 10:0:2:0: [sdk] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
Feb 12 00:02:10 Tower kernel: mpt2sas_cm0: removing handle(0x000b), sas_addr(0x4433221104000000)
Feb 12 00:02:10 Tower kernel: mpt2sas_cm0: enclosure logical id(0x500605b00991da10), slot(7)

 

I would swap that disk with another from the onboard SATA controller to see if it changes anything.
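
As a reading aid for the excerpt above, here is a small Python sketch that pulls the `hostbyte` value out of a "failed: Result:" kernel line and maps it to the host status names used in the Linux kernel's SCSI headers. The `decode` helper and its regular expression are illustrative assumptions, and the table only covers the common values. 0x01 is DID_NO_CONNECT, which fits a drive that dropped off the bus.

```python
import re

# Host status codes as named in the Linux kernel SCSI headers (partial list).
HOST_BYTES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}

# Matches lines such as:
#   ... [sdk] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=0x00
LINE_RE = re.compile(
    r"\[(?P<dev>sd[a-z]+)\] (?P<what>.+?) failed: Result: "
    r"hostbyte=(?P<host>0x[0-9a-fA-F]+) driverbyte=(?P<driver>0x[0-9a-fA-F]+)"
)


def decode(line):
    """Return (device, operation, host status name), or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    host = int(m.group("host"), 16)
    name = HOST_BYTES.get(host, f"unknown (0x{host:02x})")
    return m.group("dev"), m.group("what"), name


if __name__ == "__main__":
    sample = ("Feb 12 00:02:10 Tower kernel: sd 10:0:2:0: [sdk] "
              "Synchronize Cache(10) failed: Result: "
              "hostbyte=0x01 driverbyte=0x00")
    print(decode(sample))
```

Lines without a "failed: Result:" part, such as the "Synchronizing SCSI cache" line, return None and can be skipped when scanning a whole syslog.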

  • Author
11 hours ago, trurl said:

Nothing is assigned as disk1, is that expected?

 

Disk2 is disabled and doesn't appear to be connected since there is no SMART report for it.

 

I had to drop disk1 because it failed a while ago, and I shrank the array.

  • Author
5 hours ago, JorgeB said:

You may have a ghost in the machine...

 


I would swap that disk with another from the onboard SATA controller to see if it changes anything.

 

Thanks again @JorgeB, given time and case constraints I'll shrink the array for now and investigate further at another time.

 

PITA, that'll be 3 drives I've had to drop in as many months since my re-casing project 😞

Archived

This topic is now archived and is closed to further replies.
