Posted: Sun Jun 08, 2008 5:00 pm
Exactly. The problem here is that the original HTML contains characters that don't exist in the plain ASCII set. Things like curved quotes (“) or the Euro sign (€), for example.
When saving to text, the browser is probably saving either to strict 7-bit ASCII (or maybe ISO-8859-1, also known as Latin1), or to the encoding specified by your locale settings. Whichever encoding it's using, it doesn't seem to include some of the original characters, so they can't be written out as they are.
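To see which encodings can actually hold those characters, here's a small Python sketch. The sample string and the helper name can_encode are just made up for illustration, and it says nothing about what the browser really does internally:

```python
# Sample text with curved quotes (U+201C, U+201D) and the Euro sign (U+20AC).
sample = "\u201cHello\u201d costs 5\u20ac"

def can_encode(text, encoding):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Try the encodings mentioned above, plus UTF-8 for comparison.
results = {enc: can_encode(sample, enc) for enc in ("ascii", "iso-8859-1", "utf-8")}
print(results)
```

Neither ASCII nor Latin1 has the curved quotes or the Euro sign, so only UTF-8 should come back as True here.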
One solution would be to normalize the characters, so that fancy stuff like curved quotes is replaced with plain ASCII equivalents such as the straight quote ("). This, however, may not be easy to accomplish from the browser.
As Elderan pointed out, saving as UTF-8 would be another solution, since UTF-8 can by definition encode all Unicode characters. This would mean, however, that you'd need a text editor that understands UTF-8 to read the fancy characters in the text file, but virtually every modern text editor does.
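If you can run a script on the saved text afterwards, the normalization could look roughly like this in Python. This is only a sketch: the REPLACEMENTS table and the function name to_plain_ascii are my own, the table is far from complete, and writing the Euro sign as "EUR" is an arbitrary choice:

```python
import unicodedata

# Hand-picked replacements (an assumption, extend as needed).
REPLACEMENTS = {
    "\u201c": '"',    # left curved double quote
    "\u201d": '"',    # right curved double quote
    "\u2018": "'",    # left curved single quote
    "\u2019": "'",    # right curved single quote
    "\u2013": "-",    # en dash
    "\u2026": "...",  # ellipsis
    "\u20ac": "EUR",  # Euro sign
}

def to_plain_ascii(text):
    """Replace known fancy characters, then strip accents and anything left over."""
    text = text.translate(str.maketrans(REPLACEMENTS))
    # NFKD splits accented letters into base letter + combining mark.
    text = unicodedata.normalize("NFKD", text)
    # Dropping non-ASCII removes the combining marks (and any unknown characters).
    return text.encode("ascii", "ignore").decode("ascii")

print(to_plain_ascii("\u201cCaf\u00e9\u201d costs 5\u20ac"))
```

Note that anything not in the table and not decomposable to ASCII is silently dropped, so this loses information.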
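For comparison, writing the same kind of text as UTF-8 from Python keeps every character intact. Again just a sketch, with a throwaway temp file and a made-up file name (page.txt):

```python
import os
import tempfile

text = "\u201cHello\u201d costs 5\u20ac"

# Write to a temporary directory so nothing real gets overwritten.
path = os.path.join(tempfile.mkdtemp(), "page.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Reading it back with the same encoding gives the original text.
with open(path, encoding="utf-8") as f:
    round_trip = f.read()

size_in_bytes = os.path.getsize(path)
print(round_trip == text, len(text), size_in_bytes)
```

The file ends up larger in bytes than the number of characters, because each curved quote and the Euro sign takes three bytes in UTF-8. An editor that assumes Latin1 would show those bytes as garbage, which is why the editor needs to know the file is UTF-8.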
Unfortunately, I don't really know if there's a way to choose the encoding used when you save text in Firefox, or which encoding it uses by default. Asking in the Mozilla forums might help.