Sunday, 17 May 2015

Bristol Digest, Vol 600, Issue 5

Send Bristol mailing list submissions to
bristol@mailman.lug.org.uk

To subscribe or unsubscribe via the World Wide Web, visit
https://mailman.lug.org.uk/mailman/listinfo/bristol
or, via email, send a message with subject or body 'help' to
bristol-request@mailman.lug.org.uk

You can reach the person managing the list at
bristol-owner@mailman.lug.org.uk

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Bristol digest..."


Today's Topics:

1. Re: Forrin characters in web pages (Andrew McLean)


----------------------------------------------------------------------

Message: 1
Date: Sat, 16 May 2015 16:53:01 +0100
From: Andrew McLean <am57762@gmail.com>
To: LUG <bristol@mailman.lug.org.uk>
Subject: Re: [bristol] Forrin characters in web pages
Message-ID: <555767DD.9010502@gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed

Thanks for the various comments. I am using UTF-8 (Mint 17.1) etc.
The real problem is that the same names appear on different web pages
sometimes with an accented character (e.g. c-cedilla in "François") and
sometimes with the equivalent non-accented character (plain 'c' in this
example). I need to find the same name in different pages where in the
raw HTML files it is spelt differently.
So it's not really about how the data is handled on my system; it's the fact
that the same data (someone's name, in this case) is presented differently
on different web pages.
I have now written my own C code to find these chars and replace them
with the plain equivalent. Fiddly, and I should probably learn Perl...
Andrew
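
For readers facing the same matching problem, here is a minimal sketch of
the "replace accented characters with the plain equivalent" idea. It is
written in Python rather than C and is not the code described above. It
assumes the pages have already been decoded to Unicode text, and the
function names (strip_accents, name_key, same_name) are made up for
illustration. It uses Unicode decomposition (NFD) from the standard
library instead of a hand-built replacement table.

```python
import html
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks, so that e.g. 'ç' becomes 'c'."""
    # NFD splits each accented letter into a base letter plus
    # one or more combining marks (c + combining cedilla).
    decomposed = unicodedata.normalize("NFD", text)
    # Keep every character that is not a combining mark.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def name_key(raw: str) -> str:
    """Build a comparison key for a name taken from raw HTML (hypothetical helper)."""
    # Assumption: names may also appear as entities such as '&ccedil;',
    # so decode those before stripping accents.
    return strip_accents(html.unescape(raw)).casefold()


def same_name(a: str, b: str) -> bool:
    """True if two spellings differ only by accents, entities, or case."""
    return name_key(a) == name_key(b)


if __name__ == "__main__":
    print(strip_accents("François"))
    print(same_name("Fran&ccedil;ois", "Francois"))
```

Letters that have no Unicode decomposition (for example 'ø' or 'ß') pass
through unchanged, so a small manual table would still be needed for those.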




------------------------------

_______________________________________________
Bristol mailing list
Bristol@mailman.lug.org.uk
https://mailman.lug.org.uk/mailman/listinfo/bristol

End of Bristol Digest, Vol 600, Issue 5
***************************************
