Fortunately, he never became a “chanteur pour femmes finissantes”. Marc Almond, on the other hand… (Don’t click the link if you still want to enjoy your lunch.)
2008-10 Archive
Back after a while away
2008-10-03 - Nederlands, Notes - Reply
If you are reading this, you are on a different server, which will probably be a bit faster. Otherwise nothing has changed (yet), apart from some technical tinkering…
WordPress Database conversion from latin-1 to UTF-8
2008-10-03 - Notes, WordPress - 1 comment
If you have an old WordPress blog like me, you’ll notice all kinds of problems with accented letters in later WordPress versions. The trouble is that WordPress was once young and foolish and created its MySQL database in the default latin-1 character set. That was all fine and dandy, except for the fact that WordPress dumped UTF-8 encoded Unicode data into this database.
MySQL didn’t mind and PHP didn’t know anything about Unicode, so no harm was done. The trouble began when WordPress actually started requesting UTF-8 data from MySQL. MySQL sees that the tables are declared as latin-1, assumes the stored bytes are latin-1, and converts them to UTF-8 on the way out.
That means that your data ends up double-encoded: één becomes Ã©Ã©n, and so on. One possible way to solve this problem is to leave the communication between WordPress and MySQL in latin-1 (DB_CHARSET = 'latin1'). The best solution, however, is to fix the database: mark all fields, tables, and the database charset as UTF-8. The trouble is that whenever you do this in, for instance, phpMyAdmin, MySQL converts the actual data to UTF-8, thus double-encoding the data once and for all.
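The double-encoding described above can be reproduced outside MySQL. The following is a minimal Python sketch, not WordPress or MySQL code: the function names double_encode and repair are made up for illustration, and Python's "latin-1" codec stands in for MySQL's latin1 character set.

```python
def double_encode(text: str) -> str:
    """Simulate the bug: UTF-8 bytes are misread as latin-1 characters."""
    utf8_bytes = text.encode("utf-8")   # what WordPress stored
    return utf8_bytes.decode("latin-1")  # what a latin-1 reader thinks it sees


def repair(garbled: str) -> str:
    """Undo the misreading: recover the raw bytes, then decode them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    garbled = double_encode("één")
    print(garbled)          # the mangled form
    print(repair(garbled))  # the original text again
```

The repair only works as long as the mangled text has not been mangled a second time, which is why converting the columns directly is so risky.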
The solution for this new problem is an intermediate step: mark all latin-1 fields as blobs (binary data) and then change them back again to UTF-8 encoded text fields. This solution works because MySQL doesn’t convert anything from latin-1 to blob, nor from blob to UTF-8. But it is quite laborious, and you have to delete and recreate all indexes.
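The blob detour can be sketched as a pair of ALTER TABLE statements per column. The Python helper below only builds the SQL text; it is an illustration under assumptions, not the script from this post. The function name blob_roundtrip_sql is hypothetical, wp_posts.post_content is used as an example column from a default WordPress install, and the blob and text types must be matched to the real column by hand.

```python
def blob_roundtrip_sql(table: str, column: str,
                       text_type: str = "LONGTEXT",
                       blob_type: str = "LONGBLOB") -> list[str]:
    """Return the two statements for one column: text -> blob -> UTF-8 text.

    Assumption: identifiers are trusted (no escaping is done here), and
    column attributes such as NOT NULL or defaults must be restated by
    hand, because MODIFY replaces the whole column definition.
    """
    return [
        # Step 1: switch to binary, so MySQL leaves the stored bytes alone.
        f"ALTER TABLE `{table}` MODIFY `{column}` {blob_type};",
        # Step 2: relabel the untouched bytes as UTF-8 text.
        f"ALTER TABLE `{table}` MODIFY `{column}` {text_type} CHARACTER SET utf8;",
    ]


if __name__ == "__main__":
    for statement in blob_roundtrip_sql("wp_posts", "post_content"):
        print(statement)
```

Indexes on the affected columns are not handled by this sketch; as noted above, they have to be dropped first and recreated afterwards. Take a full database dump before running anything like this.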
So here’s my solution: More »