sully

  1. I did just about everything I could think of, troubleshooting-wise, and finally started playing with the memory timings and MemTest. I scaled the timings back to the minimum the board was advertised to support and ran the tests for a day or so; then it booted up and everything was perfect. For a short while...

     The power went out about two weeks later and somehow, even though I invoked 'shutdown' from the command line (I had a UPS after all, just not set up in the GUI), it zapped the array config on the flash drive. That messed up all the settings, and the disks ended up back in a different order. Not a huge deal, but I had planned on managing the disks/shares in a different way. My fault, I suppose, as I hadn't remembered to update my records of which disk was which when I swapped MBs. All my diagrams/labels were from the prior hardware config. Before I gave up on the original config, I did run tests on the flash to see if it had any file system issues, but all came up clear.

     Now it appears one of my brand new iStarUSA backplanes has precipitated a read/write error which has flagged a disk as faulty. Since I don't have any spares at the moment and I'm about done with computer BS in general, I've shut down the system till I feel like dealing with it, or I see a good deal on 3TB HDs... I need one to replace the supposedly faulty disk and a second for the dual parity. I'll also replace the backplane with the spare I do have and RMA the one that's giving me problems.

     But since I need to decide on what disks and at what price (who knows how long), order the disks (up to a week for shipping), preclear them (3-4 days), then rebuild the array (who knows how long, since I've not done it before), I'm just not in the mood anymore. It's my luck that I'll start having other HD errors on other machines, then it'll be a perfect storm of failure for my UnRAID plan... Perhaps I was too ambitious... Should have stuck with building it in one of the HP Microservers with 6 disks instead of going for 16, but that just wasn't the density I was after with dual parity.
  2. I think it needs the -t time (for which I picked 'now') at minimum... I recall typing shutdown and just getting the same output as if I'd typed Shutdown -?... I'm just wondering if I did something else wrong. With the UnRAID array started, do I need to do something more than just 'shutdown -t time -h -P' to ensure that the array is stopped properly, so I can hit the power switch? As I said, this would be for an emergency more than anything. I'll set up the network infrastructure to run from the powered ports on the nearest UPS so I can at least log in and shut down (assuming the power goes out when I'm home/awake), and I'll let the UnRAID UPS interface deal with it otherwise... (You were right about UnRAID just 'finding' the UPS; I was worried b/c it's a CyberPower rather than an APC.) Anyway, thanks for all the help! Sully.
  3. Thanks for the advice. Ran CHKDSK on the flash and no errors. The super.dat file is zero bytes for some reason, so that's the likely culprit. I've identified the Parity. It was the one I thought it was. I'm assuming the cache drives are correctly identified (I think they are, b/c they show the 500 GB that Mover hadn't run thru yet).

     Now I have an invalid configuration and I'm supposing that I just reset the array under New Config? If so, I should be back in business. Please confirm! I would suppose the shares are maintained under a New Config as they are a part of the array?

     Found this in relation to the above question on 'New Config': https://lime-technology.com/forum/index.php?topic=47504.msg479489#msg479489

     Array is back up and running. I'll be doing a parity check, but I still have the question below on command-line shutdown.

     BTW, I have labels on the physical drive trays for each disk, with the SN and the logical address... But for the old system! To be honest, I was just happy that everything was working and hadn't gotten around to doing new ones... So I'll be using the screenshot/printout to create new labels for the new MB/configuration.

     Last question... While I'll add the routers/switches to the UPS side of the power equation (which will allow me to remote in to shut down the array and power down -- assuming that I don't just let the UPS Settings deal with it)... What is the proper command line to shut down a running UnRAID array? For example, I log in as root and I'm at the command prompt. What I did to apparently mess this up was invoke

     >Shutdown -h -P now

     What should have been done? Thanks! Sully.
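     For the relabeling step, a minimal sketch (not an UnRAID tool) of how the serial numbers could be collected from the console instead of from a screenshot. It assumes the standard Linux udev layout where SATA disks appear under /dev/disk/by-id as 'ata-MODEL_SERIAL', and it assumes the serial contains no underscore; the function names and the EXAMPLE123 serial are hypothetical. It only reports which device node carries which serial, so the slot assignment would still come from the UnRAID Main page.

```python
import os
import re


def split_by_id(name):
    """Split a by-id entry like 'ata-MODEL_SERIAL' into (bus, model, serial).

    Returns None for partition entries ('...-part1') and for names that do
    not follow the bus-MODEL_SERIAL pattern (for example 'wwn-0x...').
    Assumption: the serial is everything after the last underscore.
    """
    if re.search(r"-part\d+$", name):
        return None
    bus, sep, rest = name.partition("-")
    if not sep or "_" not in rest:
        return None
    model, serial = rest.rsplit("_", 1)
    return bus, model, serial


def disk_labels(by_id_dir="/dev/disk/by-id"):
    """Map each device node (e.g. /dev/sdc) to its model and serial."""
    labels = {}
    for name in sorted(os.listdir(by_id_dir)):
        parts = split_by_id(name)
        if parts is None or parts[0] != "ata":
            continue
        # Each by-id entry is a symlink to the kernel device node.
        device = os.path.realpath(os.path.join(by_id_dir, name))
        labels[device] = {"model": parts[1], "serial": parts[2]}
    return labels


if __name__ == "__main__":
    for device, info in disk_labels().items():
        print(f"{device}  {info['model']}  SN {info['serial']}")
```

     Run once after any hardware change, the printed list gives one line per tray label; saving it alongside the flash backup would keep the labels and the config in step.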
  4. It's been a bumpy few months with UnRAID. I finally pulled the trigger to build my first server and had everything set up and going well when the old re-purposed MB I was using failed, and I decided to get a replacement. The new MB wouldn't boot UnRAID and I posted a few questions to aid in the troubleshooting (which didn't get any response), but in the end I figured it out.

     The upgraded system fired to life about two weeks ago and everything has been great, until last night, when the power went out. The UPS kept the system alive, but I wasn't able to remote access UnRAID to stop the array. Not knowing how long the power would be out, I figured I should shut the system down, so I invoked shutdown from the command line. I'm thinking this wasn't the correct procedure, because when I went to start the machine back up this afternoon, none of the disks showed up in the configuration. It seems to have decided that two of the disks are my cache, but it's asking me to repopulate the list of disks in the array.

     When the power went out, I wasn't transferring data to or from the array, but the cache disks had about 500 GB of data set for the next 'mover' operation. From what I gather from the FAQ/documentation, this may not be a big deal, but I'm just curious: what's my next step?

     1. Do I just assign the disk I think is the Parity and the remaining to the array, hit Start and see what happens?
     2. Is this a time when I would start the array in Maintenance Mode?
     3. If the array disks end up in different positions, how does this affect the shares I've set up and their inclusion/exclusion rules?

     I'm running 6.1, so there is just the one parity disk to locate, then the remaining 12 data disks to assign. I've not run across this issue before: after I solved the boot issue, UnRAID figured it all out on its own (i.e., populated the disks in the array where they were with the original [failed] MB), and after starting the array and letting the parity check complete, I figured it would be fine moving forward.

     I'm not sure if I'm using the right search terms, but I didn't find this question in the forums or with a Google search. Any help appreciated! Sully.

     P.S. I know that UnRAID can work with the UPS to shut the system down, but the machine isn't in its "final resting place" and so I haven't fully configured the rig. Baby steps. I also haven't got any dockers or any of the other fun stuff going yet. I'd like to get comfortable / reacquainted with Linux/UNIX before I dive in too deep.
  5. Have tried several more things to resolve this issue, so I can boot into UnRAID.

     1. I've changed the UEFI BIOS Setting for the Flash Drive from UEFI to Legacy (the old board wasn't UEFI).
     2. I've added the remaining hardware from the old machine (4 SYBA x4 SATA 3 Expansion Cards) and connected the Backplanes w/the 3TB Drive installed.

     After completing #1, I still got the Kernel Panic message with the Init issues. After completing #2, I didn't see any change in behavior. Here's a pic of the end result (where the system hangs)...

     Looking for some advice... At this point, I'd even accept someone telling me I'm a Muppet and should be able to figure this out on my own. Thanks! Sully.
  6. Looking a little harder at the video I made of one of the boot sequences, I think there may also be an error related to the /init. "Failed to execute /init (error -2)" "Kernel panic - not syncing: no working init found. Try passing init= option to kernel. See Linux Documentation/init.txt for more information" Anyway thanks for reading... Sully
  7. So... I pulled the trigger with UnRAID based on the impending release of 6.2 (for the dual parity). I set up a system under 6.1 (stable) using an old motherboard. I've not used Unix/Linux for a good few years (15+) and this was going to let me ease back into what I had enjoyed using in college. Everything was going well, I was getting comfortable with it, and I was about to look into extensions and dockers, but then something went awry. The board (or UnRAID) would just stop at night during or after the Mover operation. This was about two to three weeks into having the rig set up.

     Troubleshooting suggested to me that it was the board (it failed to POST with the disks unplugged and also with memory removed). TBH, I started with hardware and when I couldn't get it to work, I decided to replace the board and get something with native SATA3 and PCIe v2 and v3 for expansion. So if there is something wrong with the underlying UnRAID software setup, then I don't know.

     Ended up with a Gigabyte F2A88X-D3HP with an A10-7860K. Another confession: I didn't check to see if this was on the blacklist (if there is one) for UnRAID. I figured the software is hardware agnostic and this wasn't that new a MB (I've googled the MB and searched the forums; at least one user has been successful in getting UnRAID set up with this board [of course they claimed they had no problems :-) ]).

     So, a few items that I'm wondering might solve my problem...

     1. I've enabled IOMMU; I think this was the cause of the Kernel panic when I first tried to boot.
     2. I've not attached the disks yet. I'm thinking this may be the cause of the subsequent problem(s), but again, I'm out of my depth RE: troubleshooting Linux.
     3. If I've understood the error message correctly, this is the system's current gripe (or at least this is the last gasp from the Kernel before it halts): 'not syncing: VFS: Unable to mount root fs on unknown-block(0,0)'

     This seemed weird to me, as I thought UnRAID lived its life completely off of the USB, so having the drives attached or not shouldn't make a difference. Or did I miss something, since this was a configured system with 12 data disks, two cache drives and one parity (with room for the 2nd)?

     I got a different error message on a prior boot, just after I changed the IOMMU setting in the BIOS, one mentioning something about looking at "linux documentation/init.txt for guidance". All seem to point to the USB being angry that the system has changed and that it might need some configuration/settings adjustment that is, quite frankly, beyond me at present.

     Also, I have no idea how to capture the boot sequence to disk or take a screen capture to share here, so if that's the next step, please be kind and provide some advice on how to complete that procedure. Much thanks! Sully.
  8. Are the preconfigured USB drives not available anymore? I don't want to mess around with picking a USB drive and finding the ID, etc., and was just going to buy a preconfigured setup when 6.2 is stable (for the dual parity support). But the page has been gone since this Monday, when I restarted the browser. Also, there used to be an option to buy a second licence at the same time for a small discount. Is that not on anymore either? I was thinking about re-purposing another box to be a backup to the backup server, but if the buy-one-get-a-discount deal isn't an option any more, I may just run WHS or something else I've got lying around. Any info appreciated, thanks!
  9. I have 19 of these in various systems and on the last 17 of them I've done preclear and this is the first one that's flagged a change in the Throughput_Performance value. The most recent 12 or so drives (including this one) have been precleared with the same unRAID config / preclear version on the same hardware (an HP N40L). Is this a drive I should consider returning to Newegg for a replacement, or is it something else going on?

     ---------------------------------------------------------------------------------------------------------
     ========================================================================1.13
     == invoked as: ./preclear_disk.sh -c 3 -r 131027 -w 131027 -b 2000 -m [email protected] -M 4 /dev/sdc
     == TOSHIBA DT01ACA300
     == Disk /dev/sdc has been successfully precleared
     == with a starting sector of 1
     == Ran 3 cycles
     ==
     == Using :Read block size = 131027 Bytes
     == Last Cycle's Pre Read Time  : 7:24:31 (112 MB/s)
     == Last Cycle's Zeroing time   : 11:17:38 (73 MB/s)
     == Last Cycle's Post Read Time : 16:08:20 (51 MB/s)
     == Last Cycle's Total Time     : 27:27:05
     ==
     == Total Elapsed Time 90:15:22
     ==
     == Disk Start Temperature: 29C
     ==
     == Current Disk Temperature: 36C,
     ==
     ============================================================================
     ** Changed attributes in files: /tmp/smart_start_sdc  /tmp/smart_finish_sdc
                     ATTRIBUTE   NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS      RAW_VALUE
        Throughput_Performance =   141     100           54        ok          66
           Temperature_Celsius =   166     206            0        ok          36
      No SMART attributes are FAILING_NOW

      0 sectors were pending re-allocation before the start of the preclear.
      0 sectors were pending re-allocation after pre-read in cycle 1 of 3.
      0 sectors were pending re-allocation after zero of disk in cycle 1 of 3.
      0 sectors were pending re-allocation after post-read in cycle 1 of 3.
      0 sectors were pending re-allocation after zero of disk in cycle 2 of 3.
      0 sectors were pending re-allocation after post-read in cycle 2 of 3.
      0 sectors were pending re-allocation after zero of disk in cycle 3 of 3.
      0 sectors are pending re-allocation at the end of the preclear,
         the number of sectors pending re-allocation did not change.
      0 sectors had been re-allocated before the start of the preclear.
      0 sectors are re-allocated at the end of the preclear,
         the number of sectors re-allocated did not change.
     ============================================================================
     ============================================================================
     ==
     == S.M.A.R.T Initial Report for /dev/sdc
     ==
     Disk: /dev/sdc
     smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)
     Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

     === START OF INFORMATION SECTION ===
     Device Model:     TOSHIBA DT01ACA300
     Serial Number:
     Firmware Version: MX6OABB0
     User Capacity:    3,000,592,982,016 bytes
     Device is:        Not in smartctl database [for details use: -P showall]
     ATA Version is:   8
     ATA Standard is:  ATA-8-ACS revision 4
     Local Time is:    Wed Mar 16 21:15:48 2016 PDT
     SMART support is: Available - device has SMART capability.
     SMART support is: Enabled

     === START OF READ SMART DATA SECTION ===
     SMART overall-health self-assessment test result: PASSED

     General SMART Values:
     Offline data collection status:  (0x80) Offline data collection activity
                                             was never started.
                                             Auto Offline Data Collection: Enabled.
     Self-test execution status:      (   0) The previous self-test routine completed
                                             without error or no self-test has ever
                                             been run.
     Total time to complete Offline
     data collection:                 (20931) seconds.
     Offline data collection
     capabilities:                    (0x5b) SMART execute Offline immediate.
                                             Auto Offline data collection on/off support.
                                             Suspend Offline collection upon new
                                             command.
                                             Offline surface scan supported.
                                             Self-test supported.
                                             No Conveyance Self-test supported.
                                             Selective Self-test supported.
     SMART capabilities:            (0x0003) Saves SMART data before entering
                                             power-saving mode.
                                             Supports SMART auto save timer.
     Error logging capability:        (0x01) Error logging supported.
                                             General Purpose Logging supported.
     Short self-test routine
     recommended polling time:        (   1) minutes.
     Extended self-test routine
     recommended polling time:        ( 255) minutes.
     SCT capabilities:              (0x003d) SCT Status supported.
                                             SCT Error Recovery Control supported.
                                             SCT Feature Control supported.
                                             SCT Data Table supported.

     SMART Attributes Data Structure revision number: 16
     Vendor Specific SMART Attributes with Thresholds:
     ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
       1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
       2 Throughput_Performance  0x0005   100   100   054    Pre-fail  Offline      -       0
       3 Spin_Up_Time            0x0007   100   100   024    Pre-fail  Always       -       0
       4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       1
       5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
       7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
       8 Seek_Time_Performance   0x0005   100   100   020    Pre-fail  Offline      -       0
       9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       0
      10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
      12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1
     192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1
     193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       1
     194 Temperature_Celsius     0x0002   206   206   000    Old_age   Always       -       29 (Min/Max 22/29)
     196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
     197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
     198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
     199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

     SMART Error Log Version: 1
     No Errors Logged

     SMART Self-test log structure revision number 1
     No self-tests have been logged.  [To run self-tests, use: smartctl -t]

     SMART Selective self-test log data structure revision number 1
      SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
         1        0        0  Not_testing
         2        0        0  Not_testing
         3        0        0  Not_testing
         4        0        0  Not_testing
         5        0        0  Not_testing
     Selective self-test flags (0x0):
       After scanning selected spans, do NOT read-scan remainder of disk.
     If Selective self-test is pending on power-up, resume after 0 minute delay.
     ==
     ============================================================================
     ============================================================================
     ==
     == S.M.A.R.T Final Report for /dev/sdc
     ==
     Disk: /dev/sdc
     smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)
     Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

     === START OF INFORMATION SECTION ===
     Device Model:     TOSHIBA DT01ACA300
     Serial Number:
     Firmware Version: MX6OABB0
     User Capacity:    3,000,592,982,016 bytes
     Device is:        Not in smartctl database [for details use: -P showall]
     ATA Version is:   8
     ATA Standard is:  ATA-8-ACS revision 4
     Local Time is:    Sun Mar 20 15:31:10 2016 PDT
     SMART support is: Available - device has SMART capability.
     SMART support is: Enabled

     === START OF READ SMART DATA SECTION ===
     SMART overall-health self-assessment test result: PASSED

     General SMART Values:
     Offline data collection status:  (0x84) Offline data collection activity
                                             was suspended by an interrupting command from host.
                                             Auto Offline Data Collection: Enabled.
     Self-test execution status:      (   0) The previous self-test routine completed
                                             without error or no self-test has ever
                                             been run.
     Total time to complete Offline
     data collection:                 (20931) seconds.
     Offline data collection
     capabilities:                    (0x5b) SMART execute Offline immediate.
                                             Auto Offline data collection on/off support.
                                             Suspend Offline collection upon new
                                             command.
                                             Offline surface scan supported.
                                             Self-test supported.
                                             No Conveyance Self-test supported.
                                             Selective Self-test supported.
     SMART capabilities:            (0x0003) Saves SMART data before entering
                                             power-saving mode.
                                             Supports SMART auto save timer.
     Error logging capability:        (0x01) Error logging supported.
                                             General Purpose Logging supported.
     Short self-test routine
     recommended polling time:        (   1) minutes.
     Extended self-test routine
     recommended polling time:        ( 255) minutes.
     SCT capabilities:              (0x003d) SCT Status supported.
                                             SCT Error Recovery Control supported.
                                             SCT Feature Control supported.
                                             SCT Data Table supported.

     SMART Attributes Data Structure revision number: 16
     Vendor Specific SMART Attributes with Thresholds:
     ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
       1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
       2 Throughput_Performance  0x0005   141   141   054    Pre-fail  Offline      -       66
       3 Spin_Up_Time            0x0007   100   100   024    Pre-fail  Always       -       0
       4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       1
       5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
       7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
       8 Seek_Time_Performance   0x0005   100   100   020    Pre-fail  Offline      -       0
       9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       90
      10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
      12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1
     192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1
     193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       1
     194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always       -       36 (Min/Max 22/37)
     196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
     197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
     198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
     199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

     SMART Error Log Version: 1
     No Errors Logged

     SMART Self-test log structure revision number 1
     No self-tests have been logged.  [To run self-tests, use: smartctl -t]

     SMART Selective self-test log data structure revision number 1
      SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
         1        0        0  Not_testing
         2        0        0  Not_testing
         3        0        0  Not_testing
         4        0        0  Not_testing
         5        0        0  Not_testing
     Selective self-test flags (0x0):
       After scanning selected spans, do NOT read-scan remainder of disk.
     If Selective self-test is pending on power-up, resume after 0 minute delay.
     ==
     ============================================================================
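     To make the before/after comparison in the two SMART reports easier to repeat across many drives, here is a minimal sketch that re-derives the "Changed attributes" summary from two smartctl attribute tables. It is not part of preclear_disk.sh; the function names are hypothetical, the sample rows are copied from the reports above, and it assumes the usual smartctl rule that an attribute is failing when its normalized VALUE is at or below a non-zero THRESH.

```python
import re

# One row of a smartctl attribute table:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def parse_attributes(report):
    """Return {name: {"value", "thresh", "raw"}} for every attribute row."""
    attrs = {}
    for line in report.splitlines():
        match = ROW.match(line)
        if match:
            _id, name, value, _worst, thresh, raw = match.groups()
            attrs[name] = {"value": int(value), "thresh": int(thresh), "raw": int(raw)}
    return attrs


def changed_attributes(start, finish):
    """List (name, old_value, new_value, thresh, failing) where VALUE changed."""
    changes = []
    for name, after in finish.items():
        before = start.get(name)
        if before is None or before["value"] == after["value"]:
            continue
        # Assumed rule: failing when normalized value <= non-zero threshold.
        failing = after["thresh"] > 0 and after["value"] <= after["thresh"]
        changes.append((name, before["value"], after["value"], after["thresh"], failing))
    return changes


START = """
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   054    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   206   206   000    Old_age   Always       -       29 (Min/Max 22/29)
"""

FINISH = """
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   141   141   054    Pre-fail  Offline      -       66
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       90
194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always       -       36 (Min/Max 22/37)
"""

changes = changed_attributes(parse_attributes(START), parse_attributes(FINISH))
for name, old, new, thresh, failing in changes:
    print(f"{name}: {old} -> {new} (threshold {thresh}, failing={failing})")
```

     On the sample rows this reports the same two attributes preclear listed. Throughput_Performance moved from 100 to 141, which is further above its threshold of 54 rather than closer to it, and Power_On_Hours is not listed because only its raw value changed.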
  10. Thank you both. Just the kind of straight answers I was hoping for... Awesome...
  11. I appreciate the response, but it doesn't address any of my questions. I'm aware that unRAID serves as a wonderful platform with single parity. However, I want to deploy a dual parity server, and it is my understanding that after years of requests, a version of unRAID with dual parity is in the works. What I want to know is what discussion, if any, there has been about the migration from single to dual parity. I've done my Google research and similar searches on the message boards here and haven't found what I'm looking for. If there hasn't been any discussion, or if there is just no info, then I'm fine with that answer. I may well decide to build a v6.xx box, as I'd like to think that the development will include a way to deploy the feature to legacy users, and so I'll just keep a spare drive available to serve as the second parity drive... or I may not. BUT the off chance that I'll need to start from scratch to get the one feature that has been the sticking point in my adopting unRAID for years makes me want to reach out to the community for an answer to a specific question before I pull the trigger on purchasing the pro license that I'll need for all the drives I want in the system.
  12. So... I've had my eye on an unRAID setup for a few years now and I've finally gathered all the hardware. Now I need to decide if it's the right time to pull the trigger. My main concern is: can I build the server now (9 or 10 3TB drives to start), or do I need to wait till the Dual Parity feature makes it into the product? To be quite honest, the lack of dual parity has been the main reason I've not pursued the project for the past couple of years; I've had much of the hardware ready to go for most of this time. I don't want to spin my wheels now, purchasing the license(s) and building the server, and then find out that I've somehow backed myself into a corner and can't utilize the new feature. Also, I intend to add more 3TB drives to the system once I have the server up and running and I can re-purpose the drives that now serve as back-ups (i.e., preclear them and be confident in the SMART status). Would this cause any trouble? Methinks not, but I want to rest assured. Thanks!