bjmi Posted March 6, 2020

Hello,

WinSCP auto-detects the filename encoding by querying the LANG environment variable (printenv LANG) and enables UTF-8 filenames if the result contains UTF-8, for example en_US.UTF-8. Unfortunately, LANG does not contain this value in non-interactive sessions on Unraid. How can I set LANG to en_US.UTF-8 for non-interactive sessions?

Ref: https://winscp.net/forum/viewtopic.php?p=96851#96851
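The check described in this post can be sketched roughly as follows. This is only an illustration of the behaviour as bjmi describes it, not WinSCP's actual code; the function name `lang_indicates_utf8` is hypothetical, and accepting the `utf8` spelling without a hyphen is an assumption.

```python
import os
import re


def lang_indicates_utf8(lang_value):
    """Return True if a LANG value such as 'en_US.UTF-8' names a UTF-8 locale.

    Hypothetical helper: it mimics the detection described in the post and
    treats an unset or empty LANG as "not UTF-8".
    """
    if not lang_value:
        return False
    # Assumption: match both 'UTF-8' and 'utf8', case-insensitively.
    return re.search(r"utf-?8", lang_value, re.IGNORECASE) is not None


if __name__ == "__main__":
    # Reports what the current process sees, which may differ between
    # interactive and non-interactive sessions.
    current = os.environ.get("LANG")
    print("LANG =", repr(current), "UTF-8 detected:", lang_indicates_utf8(current))
```

Running the script once from a login shell and once through a non-interactive SSH command would show whether LANG differs between the two kinds of session.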
ken-ji Posted March 8, 2020

What language are your filenames in? I don't have that issue with my files in Japanese. And in which direction is the transfer giving you issues?
bjmi Posted March 9, 2020

My preferred language is German, which additionally uses the characters ÄäÖöÜüß–.

I did an initial import of my files to a user share with WinSCP (SCP method). When I accessed the same share with Windows Explorer (Windows 10, Client for Microsoft Networks), some files were missing or had bogus characters in their filenames. Samba hides those files, but they are in the filesystem (ls -l shows them). It turned out that WinSCP did not detect UTF-8 and used another code page for the filenames.

I was able to rename those files to proper UTF-8 filenames with an adapted version of the iconvmv script (https://github.com/YeLee/code/blob/master/shell/iconvmv), using iconv -f ISO-8859-1 in the mv command.

To spare other Unraid users this issue: how can the environment variable LANG be set to en_US.UTF-8 for non-interactive sessions? LANG is exported in /etc/profile.d/lang.sh, but it is not set when WinSCP queries it. I don't see this behaviour with Ubuntu Server.

Additionally, I afterwards set dos charset to cp1252 in /boot/config/smb-extra.conf:

dos charset = cp1252

Edited March 9, 2020 by bjmi
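The rename step mentioned above (converting filenames from ISO-8859-1 to UTF-8) can be sketched in Python as follows. This is not the iconvmv script itself; the helper names `to_utf8_name` and `rename_tree` are hypothetical, and the sketch assumes that every name which fails to decode as UTF-8 was written in the given source encoding. It defaults to a dry run, so nothing is renamed until `dry_run=False` is passed.

```python
import os


def to_utf8_name(raw_name, source_encoding="iso-8859-1"):
    """Return raw_name (bytes) re-encoded as UTF-8 if it is not valid UTF-8."""
    try:
        raw_name.decode("utf-8")
        return raw_name  # already valid UTF-8 (this includes plain ASCII)
    except UnicodeDecodeError:
        # Assumption: the name was written in source_encoding.
        return raw_name.decode(source_encoding).encode("utf-8")


def rename_tree(root, source_encoding="iso-8859-1", dry_run=True):
    """Find (and optionally rename) entries under root with non-UTF-8 names.

    Returns a list of (old_path, new_path) pairs as bytes.
    """
    changes = []
    # A bytes path makes os.walk yield raw bytes names; walking bottom-up
    # means a directory's contents are handled before the directory itself.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root), topdown=False):
        for name in filenames + dirnames:
            fixed = to_utf8_name(name, source_encoding)
            if fixed == name:
                continue
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, fixed)
            if os.path.exists(new_path):
                continue  # skip rather than overwrite an existing entry
            changes.append((old_path, new_path))
            if not dry_run:
                os.rename(old_path, new_path)
    return changes
```

Calling `rename_tree("/mnt/user/someshare")` (a placeholder path) only lists the candidate renames; reviewing that list before rerunning with `dry_run=False` is the safer order.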
ken-ji Posted March 9, 2020

Hmm. I tried uploading files with Japanese characters in their names, but I don't see the issue. Logging in via a terminal shows the filenames correctly in UTF-8. Running ls without UTF-8 in LANG shows question marks instead of the proper filename.

I do see that WinSCP can be forced, in the advanced settings, to assume that UTF-8 is enabled before you connect to the server.

Maybe your issue is that the filenames of your original files are not in UTF-8 but in the native Windows-1252 (Latin) encoding, and WinSCP copied the filenames as is. That results in native Latin filenames, which Samba treats as invalid UTF-8 filenames and hides to avoid weird things happening on the client side.

Other than this, I have no idea what else can be done, apart from renaming the files to UTF-8 and making sure future SCP uploads have the UTF-8 setting forced.
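The mismatch suspected here can be illustrated at the byte level. The snippet below is a small sketch with a made-up filename; it shows that the same name produces different bytes in Windows-1252 and in UTF-8, and that the Windows-1252 bytes are not valid UTF-8. The helper `is_valid_utf8` is hypothetical.

```python
def is_valid_utf8(raw):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    name = "\u00c4rger.txt"  # made-up example name starting with A-umlaut
    for encoding in ("cp1252", "utf-8"):
        raw = name.encode(encoding)
        # Print the raw bytes and whether a UTF-8 reader would accept them.
        print(encoding, raw, "valid UTF-8:", is_valid_utf8(raw))
```

A server that expects UTF-8 filenames has no reliable way to interpret the single byte 0xC4 followed by an ASCII letter, which is consistent with the files being hidden or shown with bogus characters.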
bjmi Posted March 10, 2020

On 3/9/2020 at 3:14 PM, ken-ji said:
"Maybe your issue is that the filenames of your original files are not in UTF-8 but in the native Windows-1252 (Latin) encoding, and WinSCP copied the filenames as is. That results in native Latin filenames, which Samba treats as invalid UTF-8 filenames and hides to avoid weird things happening on the client side."

That is probably what happened.

On 3/9/2020 at 3:14 PM, ken-ji said:
"Other than this, I have no idea what else can be done, apart from renaming the files to UTF-8 and making sure future SCP uploads have the UTF-8 setting forced."

That's why I want the LANG variable to be exported for non-interactive sessions, so that WinSCP's encoding detection works again.
bjmi Posted March 18, 2020

The Total Commander SFTP plugin suffers from this problem too. Its auto-detection of the filename encoding (code page) selects "ANSI (local) (0)", which scrambles non-ANSI characters in filenames.

Edited July 23, 2020 by bjmi
ken-ji Posted March 18, 2020

I guess it's technically a limitation that nobody has run into before, as most users either upload files via Samba, do OS-level disk-to-disk copies, or download the files from the internet.

So I guess you should submit a feature/enhancement request in the correct board and, in the meantime, enable the UTF-8 setting manually.