Performance and Stability issues [SOLVED]

Hey guys -

 

I'm a newbie to unraid.  Just set up a new server about a week ago with the following specs:

 

Asrock Q1900-ITX motherboard/cpu

8GB ram

Parity disk: Seagate Barracuda 3TB ST3000DM001

Disk 1: Seagate Barracuda 2TB ST2000DL003

Disk 2:  Samsung Spinpoint F1 1TB HD103UJ

 

I'm having some strange issues.  Random reboots every few days.  Nothing in the syslog for it except a notification that disks 1 and 0 are spinning down.

 

    Nov 10 03:00:32 Myklene kernel: mdcmd (27): spindown 0

    Nov 10 03:00:32 Myklene kernel: mdcmd (28): spindown 1

 

When doing a parity check, for a while the check runs pretty fast (100+ MB/sec), but then drops to about 8 MB/sec and I start seeing a ton of writes to the parity disk.  No errors, just many, many writes.

 

Any insight or ideas of how I can further troubleshoot?  I was thinking of maybe trying a different sata controller, but don't want to shell out the $$ on a guess.

 

UPDATE

SO - 1 replaced hard drive and 2 motherboards later I finally figured out the issue.  Turns out my PSU was crapping out.  It finally died completely.  New PSU and I'm back in business.  While I was at it, I precleared the parity drive, precleared and put in a new WD-Red as the (currently) single data drive and removed the other two old drives -- one of them I'll probably put through a preclear and see what SMART has to say; if it looks good I'll add it back into the array.  My guess is that the FS errors were introduced by a crash (due to failing PSU) during a data write.

 

Thanks for the help everyone!

 

-Justin

Post a syslog and smart report for the drives in question.

  • Author

Any specific time you want me to post a syslog from?  On the last unexpected reboot, the only thing I saw in syslog was spin down on drives 0 and 1.

 

SMART reports are attached.

 

sdb.txt = Parity

sdc.txt = disk1

sda.txt = disk2

sdb.txt

sda.txt

sdc.txt

Whatever is available from before the last crash I suppose.

It gives people a picture of your hardware.

  • Author

Parity check is currently running (9.39 MB/sec, 7% done).

 

As I keep pulling SMART reports, I'm watching the raw_read_error_rate continue to rise on both Seagate drives (sdb and sdc).

 

sdb was at 60936288 when I first pulled it 30 min ago -- it is now at 128554080

sdc was at 162892136.  Now it is 210346840

 

I know raw_read_error_rate isn't necessarily supposed to indicate a problem, but these values look to be going through the roof.
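
For anyone reading these numbers later, here is a rough sketch of one way to look at them. It relies on a widely repeated community interpretation that is NOT confirmed anywhere in this thread or by official documentation: that on Seagate drives the raw value is a 48-bit number whose upper 16 bits count errors and whose lower 32 bits count read operations. The function name `split_seagate_raw` is made up for illustration.

```python
from typing import Tuple


def split_seagate_raw(raw: int) -> Tuple[int, int]:
    """Split a raw value into (error_count, operation_count).

    ASSUMPTION (unconfirmed): upper 16 bits = errors, lower 32 bits = reads.
    """
    errors = raw >> 32
    operations = raw & 0xFFFFFFFF
    return errors, operations


# Raw values quoted in the post above.
readings = {
    "sdb earlier": 60936288,
    "sdb later": 128554080,
    "sdc earlier": 162892136,
    "sdc later": 210346840,
}

for label, raw in readings.items():
    errors, operations = split_seagate_raw(raw)
    print(f"{label}: errors={errors}, operations={operations}")
```

All four readings are below 2^32, so under this assumption the error part is 0 for each and the rising number would simply reflect reads performed during the parity check.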

 

I don't have the syslog from before last crash, but here's since last crash.  Huh, I definitely didn't notice this before but I see a lot of errors in there like this:

 

Nov 10 12:35:48 Myklene kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [605 45099 0x0 SD]

Nov 10 12:35:48 Myklene kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 189497997. Fsck?

 

syslog.zip

sda has this value. Monitor it for increasing amounts.

It indicates CRC errors in the communication path, i.e. a potential cable or removable drive bay issue.

 

199 UDMA_CRC_Error_Count    0x003e  100  100  000    Old_age  Always      -      2360
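
A minimal sketch of how that monitoring could be done, assuming two saved copies of the SMART attribute table taken at different times. The function names are illustrative, and the "later" reading of 2375 is a hypothetical value invented for the example, not something reported in this thread.

```python
from typing import Dict


def raw_values(smart_text: str) -> Dict[str, int]:
    """Map attribute name -> raw value from SMART attribute table rows."""
    values = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the 10th column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            values[fields[1]] = int(fields[9])
    return values


def crc_increase(before: str, after: str,
                 name: str = "UDMA_CRC_Error_Count") -> int:
    """Return how much the named raw value grew between two reports."""
    return raw_values(after).get(name, 0) - raw_values(before).get(name, 0)


# Row copied from the sda report above.
earlier = "199 UDMA_CRC_Error_Count    0x003e  100  100  000    Old_age  Always      -      2360"
# HYPOTHETICAL later reading, for illustration only.
later = "199 UDMA_CRC_Error_Count    0x003e  100  100  000    Old_age  Always      -      2375"

print("CRC errors added since last report:", crc_increase(earlier, later))
```

A result of zero between reports suggests the old errors are historical; any growth points back at the cable or bay.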

 

The other drives look OK, albeit a bit old.

 

Did you pre-clear and verify the parity drive?

Describe your hardware in more detail, e.g. PSU, CPU, RAM brand, etc.

 

I see that none of these drives has had a SMART long test run.

I like to have people run that as it puts a line in the sand or log record in the smart data as to when the surface was last verified.

Keep in mind this will take many hours. You will need to stop the array and disable any spin down timers.

 

You can trigger all 3 smart long tests at the same time.

Each test runs inside the drive's own firmware.

Then leave the machine alone for the amount of time for the longest smart long test.

 

Then capture another smart log and tuck it away.

 

Look for pending sectors or other abnormal attributes.
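
The steps above could be scripted roughly as follows. This is only a sketch: it assumes the three drives are reachable as /dev/sda, /dev/sdb and /dev/sdc (taken from the report file names in this thread) and that `smartctl` is installed. It uses `smartctl -t long` to start the extended self-test and `smartctl -a` to print the full report. The helper names are made up, and the default is a dry run that only prints the commands.

```python
import subprocess
from typing import List

# ASSUMPTION: device paths inferred from the sda/sdb/sdc report names.
DEVICES = ["/dev/sda", "/dev/sdb", "/dev/sdc"]


def long_test_command(device: str) -> List[str]:
    """Command that starts the extended (long) self-test on one drive."""
    return ["smartctl", "-t", "long", device]


def report_command(device: str) -> List[str]:
    """Command that prints the full SMART report for one drive."""
    return ["smartctl", "-a", device]


def start_long_tests(devices: List[str], dry_run: bool = True) -> List[List[str]]:
    """Start a long test on every drive; they run in parallel in firmware."""
    commands = [long_test_command(device) for device in devices]
    for command in commands:
        if dry_run:
            print("would run:", " ".join(command))
        else:
            subprocess.run(command, check=False)
    return commands


start_long_tests(DEVICES)  # dry run: prints the commands only
```

Calling `start_long_tests(DEVICES, dry_run=False)` would actually start the tests; once the longest one has finished, running `report_command(device)` for each drive and saving the output gives the record to tuck away.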

Nov 10 12:35:48 Myklene kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [605 45099 0x0 SD]

Nov 10 12:35:48 Myklene kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 189497997. Fsck?

 

 

This isn't good, it could indicate bad sectors on the drive, but since the drive was never really scanned by SMART who knows what's up with the surface vs the higher level format.  Seems like the drives themselves might need the smart long test for confidence then a reiserfsck on the high level format to make sure everything is ok.
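
To get a feel for how widespread these errors are, a saved syslog could be tallied by device and error code. A rough sketch, assuming the log lines follow the same shape as the two quoted above; the function name and the sample text are illustrative only.

```python
import re
from collections import Counter

# Matches e.g. "REISERFS error (device md1): vs-13070 ..."
REISERFS_ERROR = re.compile(r"REISERFS error \(device (\w+)\): (\S+)")


def tally_reiserfs_errors(syslog_text: str) -> Counter:
    """Count REISERFS errors per (device, error code) pair."""
    counts = Counter()
    for line in syslog_text.splitlines():
        match = REISERFS_ERROR.search(line)
        if match:
            device, code = match.groups()
            counts[(device, code)] += 1
    return counts


# Sample built from lines quoted in this thread.
sample = """\
Nov 10 03:00:32 Myklene kernel: mdcmd (27): spindown 0
Nov 10 12:35:48 Myklene kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [605 45099 0x0 SD]
Nov 10 12:35:48 Myklene kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 189497997. Fsck?
"""

for (device, code), count in sorted(tally_reiserfs_errors(sample).items()):
    print(f"{device} {code}: {count}")
```

Any device that shows up in the tally is a candidate for the reiserfsck pass mentioned above.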

  • Author

ok, kicked off the SMART long tests.  Looks like they'll be done in about 6 hours; I'll post them when they are complete.

 

As far as other detailed hardware specs, here you go:

 

Motherboard: Asrock Q1900-ITX

CPU: Intel J1900

Memory: 2x 4GB PC3-1333 (don't know manufacturer, will have to look when I get home from work)

Antec 550W PSU

 

Nothing else special I can think of, no hot-swap bays or anything on the drives.

 

Not sure if I pre-cleared and verified the parity drive.  I just put the OS on the flash disk, booted it up, logged in and configured the drives;  IIRC, it took several hours to do its thing before I was able to bring the array online.

 

IIRC, it took several hours to do its thing before I was able to bring the array online.

 

unRAID probably did its own internal blind clear; however, I'm not sure it does that to the parity drive.

In any case, for the drive with the UDMA CRC errors: if that number increases, check or change the cable.

So far the SMART logs do not show anything, let's see after a scan.

The REISERFS format errors do have me concerned. Until they go away, I would not consider the drives reliable.

I.e. do not store data on those drives yet, or if you have, try to capture that data or ensure you have a backup.

  • Author

OK, I ran the long SMART tests.  They are attached.  I am now getting these REISERFS errors constantly.  I have also attached syslog since my last reboot (I manually rebooted this morning so I could change out the sata cable on sdb).

 

[EDIT] - I forgot to take the array offline before running the long test.  Running again now...

 

Changing out the cable on sdb didn't seem to have any effect.

 

-Justin

 

sda_long.txt

sdb_long.txt

sdc_long.txt

syslog.zip

OK, I ran the long SMART tests.  They are attached.  I am now getting these REISERFS errors constantly.

Reiserfs errors indicate that there is some file system corruption at the reiserfs level on the drive(s).  To fix this you need to run reiserfsck against the drive while in maintenance mode.

Might be better to just start over and preclear all drives before trusting them.

  • Author

OK, here's the smart reports after long test.  Other than old-age, anything that indicates problems with any of these?

 

sdb= Parity

sdc = disk1

sda = disk2

sda_long.txt

sdb_long.txt

sdc_long.txt

OK, here's the smart reports after long test.  Other than old-age, anything that indicates problems with any of these?

 

sdb= Parity

sdc = disk1

sda = disk2

Those SMART reports look fine.

  • Author

Thanks  itimpi.  So I'm stumped then.  Given the storm of REISERFS errors, think perhaps I have a bad drive controller?

Thanks  itimpi.  So I'm stumped then.  Given the storm of REISERFS errors, think perhaps I have a bad drive controller?

Once you get a reiserfs error, it is going to continue occurring until you have successfully completed a reiserfsck run to fix it.

If you haven't done so already and there isn't any data on the disks, follow the earlier advice and run at least one preclear pass on all drives.

 

With screen you can run all at once.

  • Author

My plan is to drop a new 3TB WD Red in there (coming tomorrow).  I'll preclear it, and run with just that and the parity drive and see how that goes.  Will report back.

 

Thanks everyone!

  • 2 weeks later...
  • Author

First post updated with solution.

Archived

This topic is now archived and is closed to further replies.
