unRAID Server Release 4.5.4 Available



and now there is talk about another stable without any public beta testing.

 

We should learn from the need to even have had a "4.5.3-TEST".

 

Use the community at your disposal. More betas, fewer stables.

I agree, get rid of this Microsoft mentality of releasing what should be a beta version as a full version. It would be nice to see a new version posted as beta for a month or so and then after there is confidence that any critical bugs have been ironed out and fixed, release it as final.

 

Patches to released versions generally are for isolated bug fixes or minor additions such as drivers for different hardware.  In these cases I almost always am working with the person(s) who request the change.

 

4.5.2 was a bit special in that I had to quickly generate a release to support h/w needed to ship in servers.

 

The -TEST version was also a special case, released to get help from the Community to solve that format problem.

 

That makes sense, but to my eye there are two distinct user groups here: people like us who always stay up to date, and others (likely the majority) who update far less frequently and only to stable versions.

 

I still say release everything as a Beta or RC then periodically re-release one with no changes as a stable once it has proven worthy.

 

If you look at the last few stables with an impartial eye, there were some comparatively big changes in them, like Samba and kernel updates, that were released as stable without any community testing. With such a ragtag collection of hardware out there, testing in beta first is the safest way.

 

In essence some of this is semantics but to me at least it is the sensible option.

 

It also means you have far fewer stables to officially support.

 


I still say release everything as a Beta or RC then periodically re-release one with no changes as a stable once it has proven worthy.

 

I'm with NAS on this one.  From my limited perspective, it seems that the whole 'unformatted bug' fiasco with 4.5.3 could have been avoided with the proper use of beta versions.  At least one user lost data because of the bug, and that's one too many.  I understand the Adaptec to SuperMicro hardware change forced your hand a bit in releasing a new version quickly, but as I understand it that change was unrelated to the unformatted bug (please correct me if I'm wrong here).

 

On a separate note, I would love to be a part of the alpha testing, and I do have two test servers I could throw at it, but I'll be traveling for the next month so I expect I'll miss out on that.  If it is still going around mid-July, I'll see if I can pitch in at that time.


Hello All,

Since there is no 4.5.x thread in the support section I am posting here.

I noticed some errors on my console this morning suggesting that I run fsck.  So, following the wiki, I ran "reiserfsck --check /dev/md1"

My FSCK return was:

Trans replayed: mountid 20, transid 66029, desc 428, len 9, commit 438, next trans offset 421

Trans replayed: mountid 20, transid 66030, desc 439, len 9, commit 449, next trans offset 432

Trans replayed: mountid 20, transid 66031, desc 450, len 10, commit 461, next trans offset 444

Trans replayed: mountid 20, transid 66032, desc 462, len 14, commit 477, next trans offset 460

Trans replayed: mountid 20, transid 66033, desc 478, len 1, commit 480, next trans offset 463

Trans replayed: mountid 20, transid 66034, desc 481, len 9, commit 491, next trans offset 474

Trans replayed: mountid 20, transid 66035, desc 492, len 71, commit 564, next trans offset 547

Trans replayed: mountid 20, transid 66036, desc 565, len 14, commit 580, next trans offset 563

Reiserfs journal '/dev/md15' in blocks [18..8211]: 618 transactions replayed

Checking internal tree../  1 (of  6)/ 13 (of 170)/144 (of 170)

block 38887627: The level of the node (0) is not correct, (1) expected

the problem in the internal node occured (38887627), whole subtree is skipped

finished

Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.

Bad nodes were found, Semantic pass skipped

1 found corruptions can be fixed only when running with --rebuild-tree

###########

reiserfsck finished at Mon Jun 14 08:02:01 2010

###########

root@Tower:~#

 

 

So since it is asking me to run the "--rebuild-tree" switch I am asking for advice here.

If it matters at all I did not reboot the box before running FSCK.

syslog is too large to attach so I have posted it at pastebin:  http://pastebin.com/EqUQuUF1
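
For anyone who wants to triage a report like the one above by script, here is a minimal sketch. The function name `recommended_action` is made up for illustration, and the phrases it matches are simply taken from the messages in this post (plus the `--fix-fixable` option that reiserfsck also offers); this is not an interface provided by reiserfsprogs.

```python
def recommended_action(report: str) -> str:
    """Guess the next step from the text of a `reiserfsck --check` report.

    Hypothetical helper: it only looks for phrases seen in reports like
    the one posted above.
    """
    text = report.lower()
    if "no corruptions found" in text:
        return "none"
    # Check the more drastic option first, since a report that asks for
    # --rebuild-tree cannot be repaired by the lighter option.
    if "--rebuild-tree" in text:
        return "rebuild-tree"
    if "--fix-fixable" in text:
        return "fix-fixable"
    return "unknown"


# Last two lines of the report from the post above.
sample = (
    "Bad nodes were found, Semantic pass skipped\n"
    "1 found corruptions can be fixed only when running with --rebuild-tree\n"
)
print(recommended_action(sample))
```

It only reads text; deciding whether to actually run a rebuild is still a judgment call, which is why the question is being asked here.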


Thanks for the quick reply.  Is there any risk of data loss by running this?  Would it be safer just to RMA the drive?

It has nothing to do with the drive itself.  It has to do with a corrupted file-system on it.

 

Have you gotten a smartctl report on the drive?  That would let you know of its health. 

Joe L.


I didn't look previously.  I'm not 100% sure how to read all the sections here.  I do see some errors listed against the power-on time.

 

root@Tower:/dev# smartctl -d ata -a /dev/sdq

smartctl version 5.38 [i486-slackware-linux-gnu] Copyright © 2002-8 Bruce Allen

Home page is http://smartmontools.sourceforge.net/

 

=== START OF INFORMATION SECTION ===

Device Model:    ST31500341AS

Serial Number:    9VS1R38D

Firmware Version: CC1H

User Capacity:    1,500,301,910,016 bytes

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:  8

ATA Standard is:  ATA-8-ACS revision 4

Local Time is:    Mon Jun 14 09:10:16 2010 GMT+8

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x82) Offline data collection activity

                                        was completed without error.

                                        Auto Offline Data Collection: Enabled.

Self-test execution status:      (  0) The previous self-test routine completed

                                        without error or no self-test has ever

                                        been run.

Total time to complete Offline

data collection:                ( 609) seconds.

Offline data collection

capabilities:                    (0x7b) SMART execute Offline immediate.

                                        Auto Offline data collection on/off support.

                                        Suspend Offline collection upon new

                                        command.

                                        Offline surface scan supported.

                                        Self-test supported.

                                        Conveyance Self-test supported.

                                        Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                                        power-saving mode.

                                        Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                                        General Purpose Logging supported.

Short self-test routine

recommended polling time:        (  1) minutes.

Extended self-test routine

recommended polling time:        ( 255) minutes.

Conveyance self-test routine

recommended polling time:        (  2) minutes.

SCT capabilities:              (0x103f) SCT Status supported.

                                        SCT Feature Control supported.

                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 10

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate    0x000f  105  095  006    Pre-fail  Always      -      8067228

  3 Spin_Up_Time            0x0003  096  092  000    Pre-fail  Always      -      0

  4 Start_Stop_Count        0x0032  100  100  020    Old_age  Always      -      56

  5 Reallocated_Sector_Ct  0x0033  100  100  036    Pre-fail  Always      -      0

  7 Seek_Error_Rate        0x000f  061  060  030    Pre-fail  Always      -      1367503

  9 Power_On_Hours          0x0032  099  099  000    Old_age  Always      -      896

10 Spin_Retry_Count        0x0013  100  100  097    Pre-fail  Always      -      0

12 Power_Cycle_Count      0x0032  100  100  020    Old_age  Always      -      30

184 Unknown_Attribute      0x0032  100  100  099    Old_age  Always      -      0

187 Reported_Uncorrect      0x0032  001  001  000    Old_age  Always      -      156

188 Unknown_Attribute      0x0032  100  100  000    Old_age  Always      -      0

189 High_Fly_Writes        0x003a  041  041  000    Old_age  Always      -      59

190 Airflow_Temperature_Cel 0x0022  070  068  045    Old_age  Always      -      30 (Lifetime Min/Max 22/31)

194 Temperature_Celsius    0x0022  030  040  000    Old_age  Always      -      30 (0 20 0 0)

195 Hardware_ECC_Recovered  0x001a  040  033  000    Old_age  Always      -      8067228

197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      0

198 Offline_Uncorrectable  0x0010  100  100  000    Old_age  Offline      -      0

199 UDMA_CRC_Error_Count    0x003e  200  200  000    Old_age  Always      -      0

240 Head_Flying_Hours      0x0000  100  253  000    Old_age  Offline      -      38001870635228

241 Unknown_Attribute      0x0000  100  253  000    Old_age  Offline      -      3759789166

242 Unknown_Attribute      0x0000  100  253  000    Old_age  Offline      -      3403232947

 

SMART Error Log Version: 1

ATA Error Count: 156 (device log contains only the most recent five errors)

        CR = Command Register [HEX]

        FR = Features Register [HEX]

        SC = Sector Count Register [HEX]

        SN = Sector Number Register [HEX]

        CL = Cylinder Low Register [HEX]

        CH = Cylinder High Register [HEX]

        DH = Device/Head Register [HEX]

        DC = Device Command Register [HEX]

        ER = Error register [HEX]

        ST = Status register [HEX]

Powered_Up_Time is measured from power on, and printed as

DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,

SS=sec, and sss=millisec. It "wraps" after 49.710 days.

 

Error 156 occurred at disk power-on lifetime: 175 hours (7 days + 7 hours)

  When the command that caused the error occurred, the device was active or idle.

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  25 00 00 ff ff ff ef 00  1d+10:45:08.277  READ DMA EXT

  27 00 00 00 00 00 e0 00  1d+10:45:08.248  READ NATIVE MAX ADDRESS EXT

  ec 00 00 00 00 00 a0 00  1d+10:45:08.228  IDENTIFY DEVICE

  ef 03 46 00 00 00 a0 00  1d+10:45:08.209  SET FEATURES [set transfer mode]

  27 00 00 00 00 00 e0 00  1d+10:45:08.047  READ NATIVE MAX ADDRESS EXT

 

Error 155 occurred at disk power-on lifetime: 175 hours (7 days + 7 hours)

  When the command that caused the error occurred, the device was active or idle.

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  25 00 00 ff ff ff ef 00  1d+10:45:05.291  READ DMA EXT

  27 00 00 00 00 00 e0 00  1d+10:45:05.261  READ NATIVE MAX ADDRESS EXT

  ec 00 00 00 00 00 a0 00  1d+10:45:05.241  IDENTIFY DEVICE

  ef 03 46 00 00 00 a0 00  1d+10:45:05.222  SET FEATURES [set transfer mode]

  27 00 00 00 00 00 e0 00  1d+10:45:05.131  READ NATIVE MAX ADDRESS EXT

 

Error 154 occurred at disk power-on lifetime: 175 hours (7 days + 7 hours)

  When the command that caused the error occurred, the device was active or idle.

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  25 00 00 ff ff ff ef 00  1d+10:45:02.284  READ DMA EXT

  27 00 00 00 00 00 e0 00  1d+10:45:02.255  READ NATIVE MAX ADDRESS EXT

  ec 00 00 00 00 00 a0 00  1d+10:45:02.235  IDENTIFY DEVICE

  ef 03 46 00 00 00 a0 00  1d+10:45:02.215  SET FEATURES [set transfer mode]

  27 00 00 00 00 00 e0 00  1d+10:45:02.054  READ NATIVE MAX ADDRESS EXT

 

Error 153 occurred at disk power-on lifetime: 175 hours (7 days + 7 hours)

  When the command that caused the error occurred, the device was active or idle.

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  25 00 00 ff ff ff ef 00  1d+10:44:59.277  READ DMA EXT

  27 00 00 00 00 00 e0 00  1d+10:44:59.248  READ NATIVE MAX ADDRESS EXT

  ec 00 00 00 00 00 a0 00  1d+10:44:59.228  IDENTIFY DEVICE

  ef 03 46 00 00 00 a0 00  1d+10:44:59.209  SET FEATURES [set transfer mode]

  27 00 00 00 00 00 e0 00  1d+10:44:59.128  READ NATIVE MAX ADDRESS EXT

 

Error 152 occurred at disk power-on lifetime: 175 hours (7 days + 7 hours)

  When the command that caused the error occurred, the device was active or idle.

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  25 00 00 ff ff ff ef 00  1d+10:44:56.281  READ DMA EXT

  27 00 00 00 00 00 e0 00  1d+10:44:56.252  READ NATIVE MAX ADDRESS EXT

  ec 00 00 00 00 00 a0 00  1d+10:44:56.232  IDENTIFY DEVICE

  ef 03 46 00 00 00 a0 00  1d+10:44:56.212  SET FEATURES [set transfer mode]

  27 00 00 00 00 00 e0 00  1d+10:44:56.052  READ NATIVE MAX ADDRESS EXT

 

SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]

 

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

 

root@Tower:/dev#
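
Since reading the attribute table by eye is not obvious, here is a rough sketch of how it could be scanned by script. The set `WATCHED_IDS` and the function names are my own assumptions for illustration, not part of smartmontools; the sample rows are copied from the report above.

```python
# Attribute IDs whose non-zero raw value is worth a closer look.
# This selection is an assumption for illustration, not an official list.
WATCHED_IDS = {5, 187, 197, 198}


def parse_attributes(output):
    """Extract attribute rows from `smartctl -a` style text."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have a hex FLAG column.
        if len(parts) < 10 or not parts[0].isdigit() or not parts[2].startswith("0x"):
            continue
        raw = int(parts[9]) if parts[9].isdigit() else None
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "thresh": int(parts[5]),
            "raw": raw,
        })
    return rows


def worrying(rows):
    """Return (id, name, reason) for rows that deserve attention."""
    flagged = []
    for r in rows:
        if r["thresh"] > 0 and r["value"] <= r["thresh"]:
            flagged.append((r["id"], r["name"], "at or below threshold"))
        elif r["id"] in WATCHED_IDS and r["raw"] is not None and r["raw"] > 0:
            flagged.append((r["id"], r["name"], f"raw value {r['raw']}"))
    return flagged


sample = """\
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct  0x0033  100  100  036    Pre-fail  Always      -      0
187 Reported_Uncorrect      0x0032  001  001  000    Old_age  Always      -      156
197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      0
198 Offline_Uncorrectable  0x0010  100  100  000    Old_age  Offline      -      0
"""
for item in worrying(parse_attributes(sample)):
    print(item)
```

On these rows the only attribute flagged is 187 (Reported_Uncorrect), which matches the ATA Error Count of 156 further down in the report.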

 


As far as data loss goes: the file tree is unable to get to all the data now.  It might be referencing blocks of data from files you deleted, or it might be files you just don't know are missing in the directory listings (or they are listed, but their contents are incomplete).

 

With any luck, running the reiserfsck will fix things.

 

Other than that, looking in your log file, were you plugging and un-plugging the LAN cable?  It shows the link being lost again and again.


Jun 13 12:52:41 Tower kernel: r8169: eth0: link down

Jun 13 12:52:42 Tower ifplugd(eth0)[1517]: Link beat lost.

Jun 13 12:52:42 Tower kernel: r8169: eth0: link up

Jun 13 12:52:43 Tower ifplugd(eth0)[1517]: Link beat detected.

Jun 13 12:52:57 Tower kernel: r8169: eth0: link down

Jun 13 12:52:58 Tower ifplugd(eth0)[1517]: Link beat lost.

Jun 13 12:53:01 Tower kernel: r8169: eth0: link up

Jun 13 12:53:01 Tower ifplugd(eth0)[1517]: Link beat detected.

Jun 13 12:57:35 Tower kernel: mdcmd (18155): spindown 13

Jun 13 12:57:56 Tower kernel: mdcmd (18159): spindown 6

Jun 13 12:57:56 Tower kernel: mdcmd (18160): spindown 8

Jun 13 13:05:29 Tower kernel: mdcmd (18207): spindown 5

Jun 13 13:05:46 Tower kernel: r8169: eth0: link down

Jun 13 13:05:47 Tower ifplugd(eth0)[1517]: Link beat lost.

Jun 13 13:05:49 Tower kernel: r8169: eth0: link up

Jun 13 13:05:50 Tower ifplugd(eth0)[1517]: Link beat detected.

Jun 13 13:06:02 Tower kernel: r8169: eth0: link down

Jun 13 13:06:03 Tower ifplugd(eth0)[1517]: Link beat lost.

Jun 13 13:06:04 Tower kernel: r8169: eth0: link up
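
To get a quick count of how often the link dropped without scrolling through the whole syslog, something along these lines could be used. It is only a sketch: the regular expression assumes kernel lines shaped like the r8169 ones quoted above, and `count_link_events` is a name invented for this example.

```python
import re
from collections import Counter

# Matches kernel lines such as "... kernel: r8169: eth0: link down".
# The pattern is an assumption based on the lines quoted above.
LINK_RE = re.compile(r"kernel: \S+: (\w+): link (up|down)")


def count_link_events(log_text):
    """Count link up/down events per interface in syslog text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINK_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample = """\
Jun 13 12:52:41 Tower kernel: r8169: eth0: link down
Jun 13 12:52:42 Tower ifplugd(eth0)[1517]: Link beat lost.
Jun 13 12:52:42 Tower kernel: r8169: eth0: link up
Jun 13 12:57:35 Tower kernel: mdcmd (18155): spindown 13
Jun 13 12:52:57 Tower kernel: r8169: eth0: link down
"""
for (iface, state), n in sorted(count_link_events(sample).items()):
    print(iface, state, n)
```

Only the kernel lines are counted, so the matching ifplugd messages do not double the totals.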


On the NIC question, yes.  I was updating the firmware on the router the unRAID server is connected to.

 

I am running reiserfsck --rebuild-tree now and will then run another --check to see if it fixed the issue.  I will report the results when complete.

 

Update:  It's now completed.  Here are the results.  It looks like it did fix it.

 

root@Tower:~# reiserfsck --check /dev/md15

reiserfsck 3.6.19 (2003 www.namesys.com)

 

*************************************************************

** If you are using the latest reiserfsprogs and  it fails **

** please  email bug reports to [email protected], **

** providing  as  much  information  as  possible --  your **

** hardware,  kernel,  patches,  settings,  all reiserfsck **

** messages  (including version),  the reiserfsck logfile, **

** check  the  syslog file  for  any  related information. **

** If you would like advice on using this program, support **

** is available  for $25 at  www.namesys.com/support.html. **

*************************************************************

 

Will read-only check consistency of the filesystem on /dev/md15

Will put log info to 'stdout'

 

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes

###########

reiserfsck --check started at Mon Jun 14 12:53:19 2010

###########

Replaying journal..

Reiserfs journal '/dev/md15' in blocks [18..8211]: 0 transactions replayed

Checking internal tree..finished

Comparing bitmaps..finished

Checking Semantic tree:

finished

No corruptions found

There are on the filesystem:

        Leaves 159504

        Internal nodes 970

        Directories 82

        Other files 138

        Data block pointers 161395625 (0 of them are zero)

        Safe links 0

###########

reiserfsck finished at Mon Jun 14 13:41:11 2010

###########

root@Tower:~#

 

 

 

Thanks very much for your time Joe!!

  • 4 months later...

Wow.  Searching my email shows it's an MD-1500/LL that I bought in 2007.

 

I think the problem is that the system image size has increased beyond a certain value, and the 'ldlinux.sys' file on your flash (from 2007) cannot load it correctly.

 

After shutting down your server, you need to remove the Flash and plug it into your PC.  Now back up the contents of your 'config' directory, e.g., drag it to the Windows desktop.

 

Next download the version of syslinux found here:

http://lime-technology.com/download/cat_view/55-utilities

 

Now follow instructions for installing the latest release found here:

http://lime-technology.com/support/unraid-server-installation

 

Finally, restore the 'config' directory to the flash.
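
For those who prefer a script to dragging folders around, the backup and restore steps above can be sketched as follows. The function names and the demo paths are hypothetical, and the syslinux and release installation steps in between are not covered; `dirs_exist_ok` needs Python 3.8 or later.

```python
import shutil
import tempfile
from pathlib import Path


def backup_config(flash_root, backup_dir):
    """Copy the flash's 'config' directory into backup_dir."""
    dst = Path(backup_dir) / "config"
    shutil.copytree(Path(flash_root) / "config", dst)
    return dst


def restore_config(backup_dir, flash_root):
    """Copy the saved 'config' directory back onto the flash."""
    dst = Path(flash_root) / "config"
    # dirs_exist_ok lets the copy merge into a 'config' directory that the
    # fresh install may already have created.
    shutil.copytree(Path(backup_dir) / "config", dst, dirs_exist_ok=True)
    return dst


# Demo on temporary directories standing in for the flash and the desktop.
with tempfile.TemporaryDirectory() as tmp:
    flash = Path(tmp) / "flash"
    desktop = Path(tmp) / "desktop"
    (flash / "config").mkdir(parents=True)
    desktop.mkdir()
    # 'example.cfg' is a made-up file name for the demo.
    (flash / "config" / "example.cfg").write_text("hypothetical=1\n")

    backup_config(flash, desktop)
    shutil.rmtree(flash / "config")  # stands in for reinstalling the flash
    restore_config(desktop, flash)
    restored_text = (flash / "config" / "example.cfg").read_text()

print(restored_text)
```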

 

If this is unclear, then send me an email: [email protected].

 

 

 

 

OK.  So it's been a while ... over 5 months...  :)

 

Looks like this was it.  My heart skipped a beat when it said access to the flash drive was denied, but another thread said to start syslinux as administrator, and that let me do it.  It might be a good idea to mention this Windows 7 gotcha on the server installation page where it explains how to use syslinux.

 

 

I'm not sure what the latest stable is.  There is a thread saying 4.5.8 is available but 4.5.6 was the one the main page let me download so that is what I've done for now.  It looks like there is a 4.6 going stable shortly if I read that right.

 

At any rate, thanks.  The system booted this time, taking me from version 4.5 to 4.5.6, and started a Parity-Check.  The array is up and I can access my files.  Thanks for the info about syslinux and the likely issue with the system image size from my older 2007 files.

 

 
