Keint Posted August 18, 2022
Hello, I recently had problems with two SSDs. I lost the first one a few days ago, but I had time to copy my files to another disk; after that I removed the SSD from my array and made a new config without it. Then another SSD crashed: I wasn't able to go inside the folders and instead got: NO LISTING: TOO MANY FILES. So I tried to repair the disk using the method "Checking and fixing drives in the webGui", unfortunately without success: I wasn't able to run -nv or any other command, and nothing happened after clicking "Check". So I switched from maintenance mode to normal mode, and since then disk 9 is shown as "Not installed"... I'm a bit lost 😅 and of course afraid of losing all the data on disk 9. Please find my diagnostics attached. Thanks for your time. tower-diagnostics-20220819-0100.zip
trurl Posted August 19, 2022
SSDs in the array can only be written at parity speed, and cannot be trimmed. No SMART report for disk9. Reseat controller, check connections, both ends, SATA and power, including splitters. Reboot and post new diagnostics.
Keint Posted August 29, 2022 (Author)
Thanks for your answer. I'm just back home and finally have access to my Unraid server. I disconnected the SSD and reconnected it, and now I get: "Unmountable: not mounted". Unraid offers to format the drive... Is it safe to format it? My two parity disks look good. Please find the diagnostics attached. Thank you! tower-diagnostics-20220829-1048.zip
JorgeB Posted August 29, 2022
Diags only show logged info until August 14th; reboot and post new diags after array start.
Keint Posted August 29, 2022 (Author)
Thanks for your fast answer! Here we go, I rebooted and ran the diagnostics again. tower-diagnostics-20220829-1150.zip
JorgeB Posted August 29, 2022
You should not have started a rebuild. Anyway, the SSD assigned as disk9 dropped offline: stop the array, unassign disk9, and check the filesystem on the emulated disk9.
trurl Posted August 29, 2022
4 hours ago, Keint said: "Unraid propose me to format the drive ... Is it safe to format it ?"
Just thought I would answer this question directly. NEVER format a disk that has data you want to keep. When you format a disk in the array, parity is updated, and so rebuild can only result in a formatted disk. The correct way to deal with unmountable is with check filesystem, as already suggested.
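To illustrate why a rebuild cannot undo a format, here is a minimal sketch of single-parity behavior, assuming plain byte-wise XOR parity over equally sized disks. It does not model the second parity disk, which is calculated differently. The function names and the tiny byte strings are made up for illustration.

```python
from functools import reduce


def xor_parity(disks):
    """Return the byte-wise XOR of all disks (single-parity model)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


def rebuild(other_disks, parity):
    """Reconstruct one missing disk from the remaining disks plus parity."""
    return xor_parity(list(other_disks) + [parity])


# Three toy "disks" of four bytes each (illustrative values only).
disk1 = b"\x01\x02\x03\x04"
disk2 = b"\x10\x20\x30\x40"
disk9 = b"DATA"

parity = xor_parity([disk1, disk2, disk9])

# Disk 9 drops offline: parity still describes the old contents,
# so the emulated or rebuilt disk still holds the data.
rebuilt_before_format = rebuild([disk1, disk2], parity)

# Formatting the emulated disk is a write, so parity is updated to match.
formatted = b"\x00\x00\x00\x00"  # stand-in for an empty filesystem
parity_after_format = xor_parity([disk1, disk2, formatted])

# A rebuild now reproduces the formatted contents, not the old data.
rebuilt_after_format = rebuild([disk1, disk2], parity_after_format)
```

In this toy model the first rebuild returns the original bytes and the second returns the empty stand-in, which is the point made above: once parity has absorbed the format, there is nothing left to rebuild from.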
Keint Posted September 9, 2022 (Author, edited)
Hello, I'm just back in town and checked the SSDs again. Another disk has crashed, so 4 disks have crashed in total. I don't know what to do, I have lost 4 TB of data.
I tried to repair the disks using the method "Checking and fixing drives in the webGui". Nothing happens, the drives are still "Unmountable: not mounted".
I'm starting to get desperate! I can't access the data on the failing drives... Is there any way to see what is on a disk and copy it, like an external hard drive on a Mac?
I swapped the data cable, nothing changed... Maybe the SATA PCI card?
In DiskSpeed I get this error:

DiskSpeed - Disk Diagnostics & Reporting tool
Version: 2.9.4
Scanning Hardware
09:16:27 Spinning up hard drives
09:16:27 Scanning system storage
09:16:28 Scanning USB Bus
09:16:37 Scanning hard drives
09:16:40 Scanning storage controllers
09:16:41 Scanning USB hubs & devices
09:16:42 Scanning motherboard resources
09:16:42 Fetching known drive vendors from the Hard Drive Database
09:16:43 Found controller SAS2308 PCI-Express Fusion-MPT SAS-2
09:16:43 Found drive Micron Micron_5210_MTFDDAK7T6QDE Rev: D2MU805 Serial: 20212847847F (sdh), 1 partition
09:16:43 Found drive Micron Micron_5300_MTFDDAK1T9TDT Rev: D3MU001 Serial: 20292935A463 (sdi), 1 partition

Lucee 5.2.9.31 Error (expression)
Message: invalid call of the function listGetAt, second Argument (posNumber) is invalid, invalid string list index [2]
pattern: listgetat(list:string, position:number, [delimiters:string, [includeEmptyFields:boolean]]):string
Stacktrace: The Error Occurred in
/var/www/ScanControllers.cfm: line 1733
1731: <CFSET NR=i-2>
1732: <CFSET Part.Partitions[NR].PartNo=ListGetAt(CurrLine,1,":",true)>
1733: <CFSET Part.Partitions[NR].Start=Val(ListGetAt(CurrLine,2,":",true))>
1734: <CFSET Part.Partitions[NR].End=Val(ListGetAt(CurrLine,3,":",true))>
1735: <CFSET Part.Partitions[NR].Size=Val(ListGetAt(CurrLine,4,":",true))>
called from /var/www/ScanControllers.cfm: line 1643
1641: </CFIF>
1642: </CFLOOP>
1643: </CFLOOP>
1644:
1645: <!--- Admin drive creation --->

Java Stacktrace:
lucee.runtime.exp.FunctionException: invalid call of the function listGetAt, second Argument (posNumber) is invalid, invalid string list index [2]
  at lucee.runtime.functions.list.ListGetAt.call(ListGetAt.java:46)
  at scancontrollers_cfm$cf.call_000163(/ScanControllers.cfm:1733)
  at scancontrollers_cfm$cf.call(/ScanControllers.cfm:1643)
  at lucee.runtime.PageContextImpl._doInclude(PageContextImpl.java:933)
  at lucee.runtime.PageContextImpl._doInclude(PageContextImpl.java:823)
  at lucee.runtime.listener.ClassicAppListener._onRequest(ClassicAppListener.java:66)
  at lucee.runtime.listener.MixedAppListener.onRequest(MixedAppListener.java:45)
  at lucee.runtime.PageContextImpl.execute(PageContextImpl.java:2464)
  at lucee.runtime.PageContextImpl._execute(PageContextImpl.java:2454)
  at lucee.runtime.PageContextImpl.executeCFML(PageContextImpl.java:2427)
  at lucee.runtime.engine.Request.exe(Request.java:44)
  at lucee.runtime.engine.CFMLEngineImpl._service(CFMLEngineImpl.java:1090)
  at lucee.runtime.engine.CFMLEngineImpl.serviceCFML(CFMLEngineImpl.java:1038)
  at lucee.loader.engine.CFMLEngineWrapper.serviceCFML(CFMLEngineWrapper.java:102)
  at lucee.loader.servlet.CFMLServlet.service(CFMLServlet.java:51)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:292)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
  at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:240)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:207)
  at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:212)
  at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:94)
  at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:492)
  at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:141)
  at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:80)
  at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:620)
  at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:684)
  at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
  at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:502)
  at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1152)
  at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:684)
  at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.doRun(AprEndpoint.java:2527)
  at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:2516)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
  at java.lang.Thread.run(Thread.java:748)
Timestamp: 9/9/22 9:16:43 AM CEST

Thanks! Cheers! tower-diagnostics-20220909-0907.zip
Edited September 9, 2022 by Keint
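The Lucee message in this post says that ListGetAt was asked for field 2 of a colon-delimited line that did not have a second field. The DiskSpeed source and the line it was parsing are not shown in this thread, so the following is only a sketch of the general defensive pattern, written in Python rather than CFML. `parse_partition_line` and the sample lines are hypothetical.

```python
def parse_partition_line(line, delimiter=":"):
    """Return (part_no, start, end, size) as strings, or None if fields are missing."""
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) < 4:
        # Skip short or empty lines instead of indexing past the end,
        # which is what an "invalid string list index" error amounts to.
        return None
    return tuple(fields[:4])


# Made-up input: one well-formed line, one empty line, one unexpected message.
lines = ["1:32.3kB:1920GB:1920GB", "", "unrecognised disk label"]
parsed = [parse_partition_line(line) for line in lines]
```

A disk with a damaged or missing partition table could plausibly produce the kind of short line that the unguarded version trips over, which would fit the unmountable disks described here, though the thread does not confirm it.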
JorgeB Posted September 9, 2022
Diags are from after rebooting, so we cannot see what happened. For now check the filesystem on disks 8 and 10, and don't format anything.
trurl Posted September 9, 2022
2 hours ago, JorgeB said: "check filesystem on disks 8 and 10"
Be sure to capture the output so you can post it.
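One way to capture a check's output is to run it from a terminal and save everything to a file. This is a minimal sketch, assuming the check is started from the command line rather than the webGui. `run_and_capture` and the output file name are illustrative, not taken from this thread.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_capture(cmd, out_path):
    """Run cmd, save combined stdout/stderr to out_path, return (exit code, text)."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error messages in the same stream
        text=True,
        check=False,  # a non-zero exit is information here, not an exception
    )
    Path(out_path).write_text(result.stdout)
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Harmless demo command so the sketch runs anywhere.
    demo_file = os.path.join(tempfile.gettempdir(), "check-output.txt")
    code, text = run_and_capture(
        [sys.executable, "-c", "print('Phase 1 - find and verify superblock...')"],
        demo_file,
    )
    print(code)
    print(text, end="")
```

For a real read-only check, the command list might look like `["xfs_repair", "-n", "/dev/md8"]` with the array in maintenance mode. The device name is a guess and depends on the Unraid version and disk number.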
Keint Posted September 10, 2022 (Author)
Thanks for your help.

disk 8

Phase 1 - find and verify superblock...
        - block cache size set to 3057592 entries
sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
would reset superblock realtime bitmap inode pointer to 129
sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
would reset superblock realtime summary inode pointer to 130
Phase 2 - using internal log
        - zero log...
zero_log: head block 38204 tail block 38200
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used. Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_icount 0, counted 6016
sb_ifree 0, counted 180
sb_fdblocks 976277431, counted 156280199
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 2
        - agno = 1
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

XFS_REPAIR Summary    Sat Sep 10 19:10:43 2022

Phase       Start             End               Duration
Phase 1:    09/10 19:10:43    09/10 19:10:43
Phase 2:    09/10 19:10:43    09/10 19:10:43
Phase 3:    09/10 19:10:43    09/10 19:10:43
Phase 4:    09/10 19:10:43    09/10 19:10:43
Phase 5:    Skipped
Phase 6:    09/10 19:10:43    09/10 19:10:43
Phase 7:    09/10 19:10:43    09/10 19:10:43

Total run time:

disk 10

Phase 1 - find and verify superblock...
        - block cache size set to 3073088 entries
sb realtime bitmap inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 129
would reset superblock realtime bitmap inode pointer to 129
sb realtime summary inode value 18446744073709551615 (NULLFSINO) inconsistent with calculated value 130
would reset superblock realtime summary inode pointer to 130
Phase 2 - using internal log
        - zero log...
zero_log: head block 20168 tail block 20164
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used. Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_icount 0, counted 64
sb_ifree 0, counted 59
sb_fdblocks 468614399, counted 468614391
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

XFS_REPAIR Summary    Sat Sep 10 19:13:43 2022

Phase       Start             End               Duration
Phase 1:    09/10 19:13:43    09/10 19:13:43
Phase 2:    09/10 19:13:43    09/10 19:13:43
Phase 3:    09/10 19:13:43    09/10 19:13:43
Phase 4:    09/10 19:13:43    09/10 19:13:43
Phase 5:    Skipped
Phase 6:    09/10 19:13:43    09/10 19:13:43
Phase 7:    09/10 19:13:43    09/10 19:13:43

Total run time:

A data rebuild is running on the disks. I can now see the data on disks 8 and 10, which looks OK to copy to another disk 😅
The first diagnostics file was taken in maintenance mode, the second after switching to normal mode.
tower-diagnostics-20220910-1911-2.zip tower-diagnostics-20220910-1916.zip
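The superblock counter lines in the xfs_repair output from this post can be pulled out programmatically when comparing several disks. This is a minimal sketch, assuming the `sb_<name> X, counted Y` line format shown in the post. `parse_sb_counters` and `log_was_ignored` are hypothetical helpers, not part of xfsprogs. Because the check ran with -n and the log was not replayed, the mismatches may be spurious, as the ALERT in the output itself says.

```python
import re

# Matches lines such as "sb_icount 0, counted 6016".
SB_LINE = re.compile(r"^(sb_\w+) (\d+), counted (\d+)$")


def parse_sb_counters(output):
    """Return {counter: (superblock_value, counted_value)} from check output text."""
    counters = {}
    for line in output.splitlines():
        match = SB_LINE.match(line.strip())
        if match:
            name, stored, counted = match.groups()
            counters[name] = (int(stored), int(counted))
    return counters


def log_was_ignored(output):
    """True if the output carries the alert about the log being ignored under -n."""
    return "ignored because the -n option was used" in output


# Excerpt of the disk 10 output from this post.
sample = """\
zero_log: head block 20168 tail block 20164
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.
sb_icount 0, counted 64
sb_ifree 0, counted 59
sb_fdblocks 468614399, counted 468614391
"""

counters = parse_sb_counters(sample)
mismatched = {name: pair for name, pair in counters.items() if pair[0] != pair[1]}
```

Here all three counters differ from the values stored in the superblock, and `log_was_ignored(sample)` flags that the result should be read with the unreplayed log in mind.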
Keint Posted September 13, 2022 (Author)
On 9/11/2022 at 10:57 AM, JorgeB said: "Looking OK."
Yep, everything is back to normal. The SSDs I used had a problem: ERRORNOD of Samsung PM863a. I removed them, rebuilt the parity, and now everything is working well! Thanks again for all your precious help! Cheers!