Everything posted by JorgeB

  1. Cache device is dropping offline:

     Nov 2 09:36:58 Tower kernel: pcieport 0000:00:06.0: AER: Uncorrected (Fatal) error received: 0000:00:00.0
     Nov 2 09:36:58 Tower kernel: nvme 0000:05:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Inaccessible, (Unregistered Agent ID)
     Nov 2 09:37:29 Tower kernel: nvme nvme0: I/O 710 QID 14 timeout, aborting
     Nov 2 09:37:29 Tower kernel: nvme nvme0: I/O 711 QID 14 timeout, aborting
     Nov 2 09:37:29 Tower kernel: nvme nvme0: I/O 712 QID 14 timeout, aborting
     Nov 2 09:37:29 Tower kernel: nvme nvme0: I/O 713 QID 14 timeout, aborting
     Nov 2 09:37:29 Tower kernel: nvme nvme0: I/O 714 QID 14 timeout, aborting
     Nov 2 09:37:59 Tower kernel: nvme nvme0: I/O 710 QID 14 timeout, reset controller
     Nov 2 09:38:29 Tower kernel: nvme nvme0: I/O 0 QID 0 timeout, reset controller
     Nov 2 09:38:34 Tower kernel: nvme nvme0: Device shutdown incomplete; abort shutdown

     There have been previous reports of issues with Adata devices; if you can, try a different brand.
  2. One of the cache devices has been dropping:

     Nov 2 07:45:29 Tower kernel: BTRFS info (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 62451, rd 3119, flush 1898, corrupt 0, gen 0

     More info here, and see the device stats example after this list.
  3. Possibly the 3.3V pin issue; google "wd sata 3.3v pin".
  4. You can do a new config (Tools -> New Config); parity will need to be re-synced.
  5. Parity should be the same as the other drives on the server; just avoid SMR drives.
  6. See here for some benchmarks of the possible performance with various controllers. Of course board/CPU are also a factor, e.g. a PCIe 3.0 HBA will only perform optimally with a PCIe 3.0 board/CPU.
  7. Ah, missed that, in that case yes. You could, though, run dual link to the front backplane and then have that one cascade to the back one; it should still give better performance. There probably won't be much of a difference with PCIe 2.0 (PCIe 3.0 would be required for the slot itself not to be a bottleneck with dual link, see the bandwidth math after this list), but it's still worth trying.
  8. Sorry, no. You can run a scrub on the pool (see the scrub example after this list), but if the vdisks are on a NOCOW share they can't be fixed; more info here. Also, don't forget that you should always have backups of anything important.
  9. Since emulated disk3 is mounting correctly you can do the same to it; you could even have done both at the same time.
  10. Yes, but for some reason it's known to be slow with current releases; I never investigated further, and the script author has been MIA for years.
  11. The plugin author hasn't been on the forums for a month or so; hopefully it's just real life getting in the way and he will be back soon.
  12. It won't get overwritten if you follow the steps above carefully, but you can clone it with dd if you want (see the cloning sketch after this list):

      dd if=/dev/sdX of=/dev/sdY bs=4k

      where X is the source and Y is the destination.
  13. You could possibly run the invalid slot command with disk2 unassigned, then run the parity swap, but invalid slot doesn't always work without a disk assigned. I can't test if it does on v5.0.6 now since I'm about to go out for the day, but I can test tomorrow if you want.
  14. Because current parity wasn't valid, parity swap won't work while parity is invalid.
  15. Yes. Correct, same size or larger than old disk2, but not larger than current parity.
  16. Luckily v5.0.6 still boots on my test server, the procedure is this:
      - Stop the array and take note of all the current assignments.
      - Utils -> New Config -> Yes I want to do this -> Apply
      - Back on the main page, assign all the disks as they were, as well as old parity and new disk2, and double check all assignments are correct.
      - Important - after checking the assignments leave the browser on that page, the "Main" page.
      - Open an SSH session/use the console and type (don't copy/paste directly from the forum, as sometimes it can insert extra characters):
        mdcmd set invalidslot 2
      - Back on the GUI, and without refreshing the page, just start the array; do not check the "parity is already valid" box. Disk2 will start rebuilding; the disk should mount immediately, but if it's unmountable don't format it, wait for the rebuild to finish and then run a filesystem check.
  17. There have been various reports of the script being very slow with current releases; you can always run it manually.
  18. Macvlan call traces are usually related to having dockers with a custom IP address; see here:
  19. That's exactly what it looks like; a different board will change the IDs (sometimes just adding/removing some hardware does the same), so you need to edit the VM XML and/or the binds and change them (see the lspci example after this list).
  20. That is all you need for dual link. Where are you seeing this? This is showing an x8 PCIe 2.0 link, as it should be.
  21. Looks like the problem started with a power failure:

      Oct 31 02:05:36 RIAAHQ apcupsd[3244]: Power failure.
      Oct 31 02:05:38 RIAAHQ kernel: sd 2:0:0:0: device_block, handle(0x000a)
      Oct 31 02:05:38 RIAAHQ kernel: sd 2:0:1:0: device_block, handle(0x000b)
      Oct 31 02:05:38 RIAAHQ kernel: sd 2:0:2:0: device_block, handle(0x000c)
      Oct 31 02:05:38 RIAAHQ kernel: sd 2:0:3:0: device_block, handle(0x000d)
      Oct 31 02:05:38 RIAAHQ kernel: sd 2:0:4:0: device_block, handle(0x000e)
      Oct 31 02:05:38 RIAAHQ kernel: sd 2:0:5:0: device_block, handle(0x000f)
      Oct 31 02:05:38 RIAAHQ kernel: sd 2:0:6:0: device_block, handle(0x0010)
      Oct 31 02:05:38 RIAAHQ kernel: sd 2:0:7:0: device_block, handle(0x0011)
      Oct 31 02:05:38 RIAAHQ kernel: sd 2:0:8:0: device_block, handle(0x0012)
      Oct 31 02:05:38 RIAAHQ kernel: sd 2:0:9:0: device_block, handle(0x0013)
      Oct 31 02:05:38 RIAAHQ kernel: sd 2:0:10:0: device_block, handle(0x0014)
      Oct 31 02:05:38 RIAAHQ kernel: sd 2:0:11:0: device_block, handle(0x0015)

      Are all the disks on the same UPS? Or are they in some kind of separate enclosure? Reboot to clear the errors (disabled disks will remain disabled), then start the array and post new diags.
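
Device stats example (for item 2): a minimal sketch for checking the btrfs error counters on the pool, assuming the pool is mounted at /mnt/cache (adjust the mount point to your setup):

    btrfs device stats /mnt/cache

Once the underlying problem (cabling, dropped device, etc.) is fixed, the counters can be reset so any new errors are easy to spot:

    btrfs device stats -z /mnt/cache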
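Bandwidth math (for item 7): rough numbers, assuming a SAS2 (6Gb/s) HBA and the usual encoding overheads:

    dual link SAS2: 8 lanes x 6 Gb/s = 48 Gb/s raw, ~4.8 GB/s usable (8b/10b encoding)
    PCIe 2.0 x8:    8 lanes x ~500 MB/s usable = ~4 GB/s
    PCIe 3.0 x8:    8 lanes x ~985 MB/s usable = ~7.9 GB/s

So with dual link the SAS side can move slightly more data than a PCIe 2.0 x8 slot can, which is why a PCIe 3.0 slot/CPU is needed for the slot itself not to be the bottleneck.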
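Scrub example (for item 8): a minimal sketch, again assuming the pool is mounted at /mnt/cache:

    btrfs scrub start /mnt/cache
    btrfs scrub status /mnt/cache

Scrub verifies checksums and, on a redundant profile, repairs bad blocks from the good copy; files on a NOCOW share have no checksums, so scrub can neither detect nor repair corruption in them.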
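Cloning sketch (for item 12): a slightly fuller version of the dd example above; the device names are placeholders, so double check which disk is which with lsblk before running anything, since reversing if= and of= will destroy the source:

    lsblk -o NAME,SIZE,MODEL,SERIAL
    dd if=/dev/sdX of=/dev/sdY bs=4k status=progress

status=progress just prints a running byte count (GNU dd); drop it if your dd doesn't support it.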
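lspci example (for item 19): to find the new IDs after a board change, list the PCI devices and note the bus:device.function of the hardware being passed through (the grep pattern is just an example for a GPU):

    lspci -nn | grep -i 'vga\|nvidia'

Then update the corresponding bus/slot/function values in the VM XML (and/or the bind settings) to match the new addresses.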