Correcting samba's swedish characters


mumiemonstret
12-10-2002, 10:31 AM
I moved several gigs of small files from Windows to Linux. Every filename that contained a Swedish character became inaccessible from Linux, because the filename then contained an unprintable character ("fråga.bat" became "fr?ga.bat").

Recently I found out that the directive "character set = ISO8859-1" in smb.conf solves the problem.

Unfortunately, the files with unprintable characters disappear from Windows' view when I add the directive! The files are too many to rename by hand, and too large to be copied back and forth to a Windows hard drive.

Can anyone tell me how I can batch-replace the unprintable characters with the correct ones in Linux, or otherwise solve the problem?

pauper
12-12-2002, 12:19 AM
Since Windows is the only OS you have that can see the extended characters (å, etc.), you should probably use Windows to effect the conversion. This brings about its own special set of problems:
You showed a batch (.bat) file as the example. Does it call, or is it called by, any other batch file or executable?
How many extended characters are we talking about here? Is it just the 5 vowels or are there more?
Do you have a decent spreadsheet available in Windows?
The last question may seem a little odd, but I have found that Excel is great for manipulating massive amounts of filenames.
The procedure is fairly simple:
1. Do a subdirectory listing at the Windows command prompt to get all the files you are interested in, e.g.: DIR /s /b | find "å" >> c:\myfiles.txt
2. Repeat the above step for each of the other non-standard characters involved.
3. Open the finished myfiles.txt in your spreadsheet, importing it as delimited text with no delimiters, so that everything lands in one single column.
4. Use a formula on each row to parse out the file name (find the last '\' in the string and take everything to the right of it) into another cell on the same row.
5. Using the newly extracted file names, use a formula to extract the path, i.e. the leftmost (length of the total string - length of the file name) characters, into another cell on the same row.
6. If there are only a few characters to replace, do a bulk search-and-replace on the extracted file name column only. If there are a lot, you might want to consider a lookup table and formulae.
7. Concatenate the updated file names onto the extracted path names, and paste them in as values, not formulas.
8. Insert a column at the far left to take the DOS REN command.
9. Insert a single column between the old path/filename and the new path/filename.
10. Search for unique strings that are NOT used anywhere in the data (I usually use 1@, 2@ etc.).
11. In the first row, first cell (the new column), insert 'REN 1@' (without the quotes).
12. In the third column, insert '2@ 1@' (again without the quotes; leave a space between them).
13. In the fifth column, insert '2@' without the quotes.
14. Copy these three cells down the entire length of the file list. Delete any extraneous data in the spreadsheet.
15. Resize the columns to accommodate the widths of the text and close the spreadsheet, saving the file.
16. Open the text file with Notepad, UltraEdit or another text editor.
17. Search and replace all quotes (") with nothing.
18. Search and replace all '1@ ' (1@ followed by a space) with just '1@'. Do this repeatedly until nothing more is found.
19. Repeat the previous step with 2@ to eliminate any spaces BEFORE the 2@.
20. Replace all 1@ and all 2@ with a double quote.
21. Save the file and exit back to DOS.
22. Rename your new file with a .bat extension, cross your fingers, close your eyes tight and run it!
A bit long-winded, yep! But 5 minutes of this beats the heck out of several hours renaming hundreds of individual files!!
Having said all this, there may be a way to do it in linux, but I really don't know.
Why all the quotes? Long file names. Windows understands them but DOS doesn't, so you surround the entire name with quotes to make sure it gets the idea!
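
For anyone who would rather script it than go through a spreadsheet, the same idea can be sketched in a few lines of Python. This is only a rough sketch under some assumptions: the REPLACEMENTS table is a made-up example of the lookup table mentioned above, and the function names ren_line and build_batch are invented for illustration. One deliberate difference from the spreadsheet steps: the second argument here is only the new file name, because REN does not accept a drive or path on the target name.

```python
import ntpath  # Windows-style path handling, usable on any OS

# Hypothetical lookup table (an assumption): decide for yourself which
# character should replace which.
REPLACEMENTS = {"å": "a", "ä": "a", "ö": "o", "Å": "A", "Ä": "A", "Ö": "O"}


def ren_line(old_path, table=REPLACEMENTS):
    """Return a REN command for old_path, or None if the name needs no change."""
    # Split at the last backslash: everything to the right is the file name.
    folder, old_name = ntpath.split(old_path)
    new_name = old_name
    for bad, good in table.items():
        new_name = new_name.replace(bad, good)
    if new_name == old_name:
        return None
    # Quote both names so long file names with spaces survive.
    # The target is the bare new file name, with no path in front of it.
    return 'REN "{}" "{}"'.format(old_path, new_name)


def build_batch(listing_lines, table=REPLACEMENTS):
    """Turn the lines of a DIR /s /b listing into a list of REN commands."""
    commands = []
    for raw in listing_lines:
        path = raw.strip()
        if not path:
            continue  # skip blank lines in the listing
        command = ren_line(path, table)
        if command:
            commands.append(command)
    return commands
```

Feed build_batch the lines of the listing file and write the returned commands, one per line, to a .bat file. The batch file has to be saved in the same encoding the command prompt uses, which this sketch does not deal with.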

Tip: When you run this, it's going to fly by and you can't scroll back, so run it as c:\myfiles.bat > c:\results.txt so you can browse the output afterwards and look for any errors.
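
As for doing it on the Linux side, here is a rough, untested-on-your-data sketch of what that might look like in Python. It assumes the names were written to disk in the DOS code page CP850 (that is a guess about how the files were stored, so check one name first) and re-encodes them to ISO-8859-1 to match the "character set = ISO8859-1" directive. The function names convert_name and rename_tree are invented for this example.

```python
import os

SRC = "cp850"        # assumption: encoding the names currently have on disk
DST = "iso-8859-1"   # target: matches the smb.conf "character set" directive


def convert_name(name, src=SRC, dst=DST):
    """Re-encode one file name given as bytes.

    Plain ASCII names, and names that cannot be converted, come back unchanged.
    """
    try:
        name.decode("ascii")
        return name  # nothing to fix
    except UnicodeDecodeError:
        pass
    try:
        return name.decode(src).encode(dst)
    except (UnicodeDecodeError, UnicodeEncodeError):
        return name  # leave anything unexpected alone


def rename_tree(root, dry_run=True):
    """Walk root bottom-up and return (old, new) pairs; rename unless dry_run."""
    changes = []
    # Byte paths keep the raw names intact; bottom-up order means a directory
    # is renamed only after everything inside it has been handled.
    for folder, dirs, files in os.walk(os.fsencode(root), topdown=False):
        for name in dirs + files:
            new = convert_name(name)
            if new == name:
                continue
            old_path = os.path.join(folder, name)
            new_path = os.path.join(folder, new)
            if os.path.lexists(new_path):
                continue  # never overwrite an existing entry
            changes.append((old_path, new_path))
            if not dry_run:
                os.rename(old_path, new_path)
    return changes
```

Call rename_tree on the share's directory with dry_run=True first and look at the returned pairs; only switch to dry_run=False once the new names look right.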

Hope that helps!! :D