bkastner

Members
  • Posts

    1198

Everything posted by bkastner

  1. I've attached the log. I rebooted only because I've had something similar happen in the past, and a reboot brought the disks back online normally. I wanted to confirm this didn't fix the issue before submitting to the forum. I will wait for a bit to give you (or whoever) a chance to review the previous log before I do the rebuild. I am glad to hear the disk looks good, but am confused as to what happened. Hopefully the previous log shows something. syslog-20150903-145649.zip
  2. Is anyone able to review the diagnostics and provide recommendations on next steps? Thanks
  3. Here you go. cydstorage-diagnostics-20150903-1603.zip
  4. So while trying to help someone else diagnose an issue with the SAS2LP card I had started a parity check, and just left it running. I got an email notification after a few hours of a failed drive (even though it was fine on Sept 1st during the regular parity check). In the GUI I see the drive X'd out with 'device is disabled, contents emulated', but looking at the disk I don't really see any issues:

     SMART self-test history:
     Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
     # 1  Short offline     Completed without error  00%        12738            -

     SMART error log: No Errors Logged

     #    Attribute Name           Flag    Value  Worst  Threshold  Type      Updated  Failed  Raw Value
     1    Raw Read Error Rate      0x002f  200    200    051        Pre-fail  Always   -       0
     3    Spin Up Time             0x0027  178    176    021        Pre-fail  Always   -       8091
     4    Start Stop Count         0x0032  098    098    000        Old age   Always   -       2333
     5    Reallocated Sector Ct    0x0033  200    200    140        Pre-fail  Always   -       0
     7    Seek Error Rate          0x002e  100    253    000        Old age   Always   -       0
     9    Power On Hours           0x0032  083    083    000        Old age   Always   -       12739 (1y, 165d, 19h)
     10   Spin Retry Count         0x0032  100    100    000        Old age   Always   -       0
     11   Calibration Retry Count  0x0032  100    253    000        Old age   Always   -       0
     12   Power Cycle Count        0x0032  100    100    000        Old age   Always   -       29
     192  Power-Off Retract Count  0x0032  200    200    000        Old age   Always   -       12
     193  Load Cycle Count         0x0032  199    199    000        Old age   Always   -       4748
     194  Temperature Celsius      0x0022  122    108    000        Old age   Always   -       30
     196  Reallocated Event Count  0x0032  200    200    000        Old age   Always   -       0
     197  Current Pending Sector   0x0032  200    200    000        Old age   Always   -       0
     198  Offline Uncorrectable    0x0030  200    200    000        Old age   Offline  -       0
     199  UDMA CRC Error Count     0x0032  200    200    000        Old age   Always   -       0
     200  Multi Zone Error Rate    0x0008  200    200    000        Old age   Offline  -       0

     I've tried rebooting, but the drive still shows as failed. What would be my next steps to try? Should I replace it right away, or is the drive likely okay and something else the issue?

     Before I get asked, these drives are attached to a backplane on a Norco 4224, and I have not even been in the same room as the server in almost a week - nothing was jarred, jangled, nudged, bumped, or given a dirty look.
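For anyone wanting to eyeball a SMART attribute table like the one above programmatically, here is a minimal Python sketch. It assumes each row follows the layout shown in the post (ID, name, flag, value, worst, threshold, type, updated, failed, raw value). The helper names and the set of "watched" attribute IDs are illustrative assumptions, not an official list or anything UnRAID itself does.

```python
import re

# Assumed row layout (taken from the table in the post):
# ID, name, flag, value, worst, threshold, type, updated, failed, raw value
ROW = re.compile(
    r"^\s*(\d+)\s+(.+?)\s+(0x[0-9a-f]{4})\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(Pre-fail|Old age)\s+(Always|Offline)\s+(\S+)\s+(\d+)"
)

# Hypothetical choice of raw counters worth a second look when non-zero.
WATCHED_RAW = {5, 196, 197, 198, 199}


def parse_rows(text):
    """Parse SMART attribute rows into dictionaries; non-matching lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "id": int(m.group(1)),
                "name": m.group(2),
                "value": int(m.group(4)),
                "threshold": int(m.group(6)),
                "raw": int(m.group(10)),
            })
    return rows


def suspicious(rows):
    """Return (id, name, reason) for rows that look worth investigating."""
    flagged = []
    for r in rows:
        if r["threshold"] > 0 and r["value"] <= r["threshold"]:
            flagged.append((r["id"], r["name"], "value at or below threshold"))
        elif r["id"] in WATCHED_RAW and r["raw"] > 0:
            flagged.append((r["id"], r["name"], "non-zero raw count"))
    return flagged


SAMPLE = """\
  1 Raw Read Error Rate 0x002f 200 200 051 Pre-fail Always - 0
  5 Reallocated Sector Ct 0x0033 200 200 140 Pre-fail Always - 0
197 Current Pending Sector 0x0032 200 200 000 Old age Always - 0
199 UDMA CRC Error Count 0x0032 200 200 000 Old age Always - 0
"""

print(suspicious(parse_rows(SAMPLE)))
```

With the rows quoted from the post nothing is flagged, which is consistent with the disk itself looking healthy even though the array has disabled it.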
  5. I started a parity check and get the following:

     Total size: 6 TB
     Elapsed time: 53 minutes
     Current position: 342 GB (5.7 %)
     Estimated speed: 109.4 MB/sec
     Estimated finish: 14 hours, 22 minutes

     I agree it would be great if I was seeing 150 MB/sec, but I still don't think it's that bad.
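The estimated finish in a report like this can be reproduced with simple arithmetic. The sketch below assumes decimal units (1 TB = 1000 GB, 1 GB = 1000 MB) and that the current speed holds for the rest of the check, which it normally will not as the heads move towards the slower inner tracks. The function name is made up for illustration.

```python
def parity_eta(total_gb, position_gb, speed_mb_s):
    """Return (hours, minutes) remaining, assuming the current speed holds."""
    remaining_mb = (total_gb - position_gb) * 1000  # decimal units assumed
    seconds = remaining_mb / speed_mb_s
    total_minutes = round(seconds / 60)
    return divmod(total_minutes, 60)


# Figures from the post: 6 TB total, 342 GB done, 109.4 MB/sec
print(parity_eta(6000, 342, 109.4))  # (14, 22)
```

That matches the 14 hours, 22 minutes shown in the GUI, so the estimate is just remaining size divided by the current speed.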
  6. This results in nothing coming back for me.

     root@CydStorage:~# lspci -vv -d 11ab:*
     root@CydStorage:~#

     Instead of 11ab, try 1b4b

     That seems to have worked better:

     root@CydStorage:~# lspci -vv -d 1b4b:*
     01:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller (rev 03)
         Subsystem: Marvell Technology Group Ltd. Device 9480
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 64 bytes
         Interrupt: pin A routed to IRQ 16
         Region 0: Memory at f7340000 (64-bit, non-prefetchable) [size=128K]
         Region 2: Memory at f7300000 (64-bit, non-prefetchable) [size=256K]
         Expansion ROM at f7360000 [disabled] [size=64K]
         Capabilities: [40] Power Management version 3
             Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
             Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
         Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
             Address: 0000000000000000 Data: 0000
         Capabilities: [70] Express (v2) Endpoint, MSI 00
             DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
                 ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
             DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                 RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop-
                 MaxPayload 256 bytes, MaxReadReq 512 bytes
             DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
             LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <512ns, L1 <64us
                 ClockPM- Surprise- LLActRep- BwNot-
             LnkCtl: ASPM L0s Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
                 ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
             LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
             DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
             DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
             LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                 Compliance De-emphasis: -6dB
             LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
         Capabilities: [100 v1] Advanced Error Reporting
             UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
             UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
             UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
             CESta: RxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ NonFatalErr+
             CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
             AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
         Capabilities: [140 v1] Virtual Channel
             Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
             Arb: Fixed- WRR32- WRR64- WRR128-
             Ctrl: ArbSelect=Fixed
             Status: InProgress-
             VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                 Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                 Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                 Status: NegoPending- InProgress-
         Kernel driver in use: mvsas
         Kernel modules: mvsas

     02:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller (rev 03)
         Subsystem: Marvell Technology Group Ltd. Device 9480
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 64 bytes
         Interrupt: pin A routed to IRQ 17
         Region 0: Memory at f7240000 (64-bit, non-prefetchable) [size=128K]
         Region 2: Memory at f7200000 (64-bit, non-prefetchable) [size=256K]
         Expansion ROM at f7260000 [disabled] [size=64K]
         Capabilities: [40] Power Management version 3
             Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
             Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
         Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
             Address: 0000000000000000 Data: 0000
         Capabilities: [70] Express (v2) Endpoint, MSI 00
             DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
                 ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
             DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                 RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop-
                 MaxPayload 128 bytes, MaxReadReq 512 bytes
             DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
             LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <512ns, L1 <64us
                 ClockPM- Surprise- LLActRep- BwNot-
             LnkCtl: ASPM L0s Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
                 ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
             LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
             DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
             DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
             LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                 Compliance De-emphasis: -6dB
             LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
                 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
         Capabilities: [100 v1] Advanced Error Reporting
             UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
             UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
             UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
             CESta: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+
             CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
             AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
         Capabilities: [140 v1] Virtual Channel
             Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
             Arb: Fixed- WRR32- WRR64- WRR128-
             Ctrl: ArbSelect=Fixed
             Status: InProgress-
             VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                 Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                 Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                 Status: NegoPending- InProgress-
         Kernel driver in use: mvsas
         Kernel modules: mvsas
     root@CydStorage:~#

     As a reminder, I am one of the people with these cards and no issues.
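When comparing output like this between servers, it can help to pull out only the flags that lspci marks as set (+) for a given field, for example DevSta or CESta. The following is a rough Python sketch, assuming normal `lspci -vv` output with one field per line. The function name is made up, and the sample text is just two lines copied from the output above.

```python
def set_flags(lspci_text, field):
    """For every line starting with '<field>:', list the flags marked '+'."""
    results = []
    for line in lspci_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(field + ":"):
            tokens = stripped[len(field) + 1:].split()
            # Keep only tokens ending in '+', and drop the '+' itself.
            results.append([t[:-1] for t in tokens if t.endswith("+")])
    return results


SAMPLE = """\
    DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
    CESta: RxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ NonFatalErr+
"""

print(set_flags(SAMPLE, "DevSta"))  # [['CorrErr', 'UnsuppReq']]
print(set_flags(SAMPLE, "CESta"))   # [['RxErr', 'BadTLP', 'BadDLLP', 'Timeout', 'NonFatalErr']]
```

Running the same function over output from a server that does have problems would make it easier to see whether the two differ in these fields at all.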
  7. This results in nothing coming back for me.

     root@CydStorage:~# lspci -vv -d 11ab:*
     root@CydStorage:~#
  8. I have 7 drives on the card and my speed is cut in half from V5 to V6 during parity checks, which then take more than a day instead of running overnight.

     So, back to the question I posted earlier: have you tried the tunables script at all? http://lime-technology.com/forum/index.php?topic=29009.0 I would highly suggest you use it to refine your parameters and see where that leaves you. Moving to v6 also means moving to 64-bit, which can change things. I think dropping 50% is definitely extreme, but with a correctly tuned environment the drop may be substantially reduced.
  9. Your case is different: at the 3TB mark you lose half your drives, and at the 4TB mark you only have the two 6TB drives left. This will highly inflate your average speed. Based on my tests your starting speed is around 90-100MB/s; feel free to post a screenshot. This is from a test I did on an unlimited server with drives similar to yours, where the parity check starts at around 150MB/s:

     Duration: 15 hours, 33 minutes, 40 seconds. Average speed: 107.1 MB/sec

     So you're still limited, although in your case it is not a big difference because of the various drive sizes.

     Fair enough, but also to RobJ's point: this is a maintenance task that happens once a month, so a +/- 10MB/sec difference between friends isn't causing me a lot of sleepless nights. To be fair, if I was around 30-40MB/sec and knew the average was ~90-140MB/sec I would want to better understand the discrepancy, but I find my speed reasonable, so am not overly worried about it. That being said, I am happy to contribute my settings for others to compare against, to help work out where the differences lie and determine root cause. I know one thread indicated that having a mix of MB based SATA and SAS2LP based SATA could give much worse results than having all drives on the SAS2LP cards. Since this has always been my config I can't validate it, but it may be worth checking out as a test for those with issues.
  10. Have you guys tried using the tunables script? I am one of those users who have 2 SAS2LP cards, and all my drives are on them (none on the MB) and have had no issues. Here is my parity check from yesterday:

      Last checked on Tue 01 Sep 2015 05:45:14 PM EDT (yesterday), finding 0 errors.
      Duration: 17 hours, 45 minutes, 13 seconds. Average speed: 93.9 MB/sec

      Some users get higher, but I don't think this is an unreasonable speed. There do seem to be a few users with issues with these cards, but there seem to be at least as many (and likely quite a few more) that run these cards without issues. This leads one to believe that the issue may not be with UnRAID per se, but something in various users' configs.
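The average speed in a parity report is simply the parity size divided by the duration, which makes it easy to check a report or compare runs. A small Python sketch follows, assuming a 6 TB parity drive in decimal units (1 TB = 1,000,000 MB). The function name is hypothetical.

```python
def average_speed_mb_s(size_tb, hours, minutes, seconds):
    """Average parity-check speed in MB/sec for a given size and duration."""
    duration_s = hours * 3600 + minutes * 60 + seconds
    return size_tb * 1_000_000 / duration_s  # decimal units assumed


# Duration from the post: 17 hours, 45 minutes, 13 seconds on an assumed 6 TB parity drive
print(round(average_speed_mb_s(6, 17, 45, 13), 1))  # 93.9
```

The result lines up with the 93.9 MB/sec reported, so the figure is an average over the whole parity drive, not the speed at any one point in the check.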
  11. Out of curiosity, did you ever try changing out the power supply, or is that still a future troubleshooting step?
  12. It may be DNS. You should have a hosts file on your PC (under c:\Windows\System32\Drivers\etc). Edit it in Notepad and add a line in the following format:

      192.168.XXX.XXX mytower

      Make sure to save it back as hosts, and not hosts.txt (there should be no extension). From a CMD window you should be able to 'ping mytower' and confirm you are pinging the server. You can then use the hostname in a browser.
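To illustrate the hosts file format described above (an IP address followed by one or more names, with `#` starting a comment), here is a small Python sketch that reads such text and looks up a name. The function name and the 192.168.1.50 address are placeholders for illustration only; use your server's real address.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # First entry wins; hostnames are compared case-insensitively.
            mapping.setdefault(name.lower(), ip)
    return mapping


SAMPLE = """\
# Sample hosts file contents
127.0.0.1     localhost
192.168.1.50  mytower   # placeholder address for the UnRAID server
"""

print(parse_hosts(SAMPLE).get("mytower"))  # 192.168.1.50
```

If the name you expect is missing from the parsed result, the entry is probably malformed or was saved to the wrong file (for example hosts.txt).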
  13. Not really. In order for the disks to be usable in UnRAID you will want to boot UnRAID up, pre-clear the disks and then format them using UnRAID. This means they can't contain any existing data on them. You will likely need to get UnRAID up and running, and then from another PC move the data across the network from a Windows machine with your existing data drives (either internally mounted or using an external caddy). As mr-hexen mentioned, there is an element of risk using your existing drives, but theoretically, once they are cleared of data you can add them to the UnRAID server and then run the pre-clear process on them (possibly multiple times). You can also then run a SMART report on the disks and post the results here where people can provide recommendations on whether those disks are worth adding to UnRAID, or whether you are better avoiding them.
  14. It is very easy to increase the size of the Docker image file. Under settings -> docker first disable docker, this will allow you to enter a new size value. E.g. change 10G to 20G and re-enable docker.

      It will just expand? I always thought that would create a new container and require me to redownload the Dockers.

      It will just expand, without creating anything new or losing anything existing.

      Great to know. I just tried it and it worked great. I also got an updated notification that my disk utilization has returned to normal. Great addition! I much prefer finding out that I am running out of space this way, versus when there's an actual issue.
  15. It is very easy to increase the size of the Docker image file. Under settings -> docker first disable docker, this will allow you to enter a new size value. E.g. change 10G to 20G and re-enable docker.

      It will just expand? I always thought that would create a new container and require me to redownload the Dockers.
  16. I just upgraded to rc6 and received an email notification that my Docker image disk utilization is at 81%. Very handy notification, but it raises the question of how I should go about addressing this. I apologize if this is not rc6 related per se, but since it seems to be an rc6 addition I am guessing others may come here once they see the notification as well.
  17. Yes, you can just format a USB drive with UnRAID and boot to it without touching any of your internal disks. You should be able to see if it boots okay, and you can try and connect to the GUI from another machine to confirm it's on the network successfully.
  18. I have to say I wasn't really aware of issues with these cards. I've used 2 of them for my drives for a couple of years without issue - other than having to set my WD Red 6TB parity drive to never spin down (though I don't need to do this for data drives). However, I have a Norco case with backplanes, so have always had all my drives on the SAS2LP cards and have never tried to combine them with my motherboard SATA controllers, so maybe I was lucky and avoided some of the headaches.
  19. Given your budget I agree with redlaw on checking your existing hardware. You need to determine if your existing system will boot up UnRAID and confirm it's accessible on the network. If so, I would suggest you stick with this for now and just worry about storage. Since you are looking at 6TB of usable storage I would likely start with 2 6TB drives (WD Red), one for parity and one for data. However that is going to cost $420 plus taxes in Canada. If you want to reduce this you could buy 3 3TB disks, so you are using 1 for parity and 2 for data, which will come in quite a bit cheaper (3TB WD Reds are $140, so only $320 plus taxes). However I would personally avoid filling up a case with 3TB disks today (or 2TB disks as redlaw suggests) and invest in the future by starting with 6TB disks. If your system boots UnRAID and can be accessed on the network, then with the above you are good to go, and can always swap out parts down the road. If you can't boot UnRAID for some reason, then this is a whole other story.
  20. I have to say, after reading the OP my first thought was PSU. I've had flaky issues in the past that were due to an underpowered PSU, but I actually own the same one you do, as I overbought to eliminate this issue in the future. That being said, there may still be an issue there. I would suggest finding somewhere local with a good return policy and trying a new PSU - if it doesn't fix the issue you can always return it.
  21. I was hoping to set this up on my home network. How did you redirect the user folders? I would also like to have these folders available offline on my laptop. Not sure if I am just trying to make things more complicated than they are.... Roland

      Depending on the amount of data you have this may or may not be the best solution today. For reasonable amounts of data OneDrive makes way more sense. For my work PCs I sync documents, pictures and IE favorites this way so regardless of what PC I am on I have access to all this content. For my home environment I have 100GB of music and roughly the same of family pictures. I've created a music share on UnRAID for iTunes and have all my PCs in the house point to it and use a common library. I also have a Pictures share and replicate to CrashPlan, but I point my local pictures folder and to OneDrive (if that makes sense). I also "share" my documents folder with my work account so it's all in one place. With OneDrive you get 15GB for free (I think) and 200GB for $4/month. Being able to sync to the cloud makes this content accessible on my PCs, phones, tablets, or when I am visiting family, and since it's built into Windows 8.1 and Windows 10 it's very easy to set up. I am not saying the OP is wrong in how he has things set up, but I just don't think it makes the most sense today given the available options.
  22. I thought the process was once you had downloaded a beta/RC that you should be able to check for updates on the plugins page and find/install the latest RC? I understand you having to manually enter yourself into the RC stream, but once there why would the auto-update not be enabled? Or is this something unique to RC3?
  23. Yes, it's pretty ridiculous. To be honest it was one of those things I did without really thinking about how often I am really going to do this. One of my dumber choices.
  24. Yes, it does use Java, and doesn't seem to work at all for me. I really like the SuperMicro board, but have been disappointed by the whole IPMI piece (management and BIOS update). These are not reasons I would ever buy one again (though I would if looking at a Xeon solution again).
  25. I can confirm that the item garycase mentions is the proper license for the BIOS update via IPMI. I've had a horrible time with CDW though buying it. I am in Canada and it was $40. I purchased 3 weeks ago, and heard nothing for a week, then I was asked for the board SN, and then nothing for a week, then I was asked for the MAC address, and almost a week later I finally got the code. Since it's supposed to take 48-72 hours and actually took 3 weeks I am trying to push for a refund, but don't know how far I will get. I hounded CDW almost daily, and they had issues with their distribution center and then with SuperMicro direct. It was a crappy process overall, and not one I would recommend without serious patience (though I would also have your SN and MAC addresses handy from day one). I did confirm the key provided works, and I was able to update my BIOS - however since the key is tied to the motherboard, and I may do this once or twice more over the life of the board it was likely not a worthwhile purchase overall.