All right, let me explain. Some of you may have noticed some site downtime over the last week and a half or so. Basically, I ran into some trouble when I tried to use the “auto-upgrade” function provided by my web host to update WordPress to the version I was already running. No one cares about that, but let’s just say that it managed to screw up some databases and left the site with numerous odd characters scattered about.
Fixing those weird little symbols has been quite a pain, but I think I’ve got it pretty well under control. A thorough search of the WordPress forums helped me see that this is a very common problem. It has to do with the encoding of the database that stores all of the information, and it seems my database was a different version than WordPress expected. Still, the “export XML” function of WordPress is pretty poor, and I see why it’s not recommended as anything more than a last-ditch effort to recover data.
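For anyone curious how the odd characters come about, here is a minimal Python sketch. It assumes the mismatch was UTF-8 text being read back as Windows-1252 (a close relative of Latin-1), which is one common way this kind of garbling happens; I can't say for certain it is exactly what happened to my database. The `garble` function name is made up for illustration.

```python
def garble(text):
    """Simulate UTF-8 bytes being misread as Windows-1252 (assumed mismatch)."""
    # Step 1: the text is stored as UTF-8 bytes.
    utf8_bytes = text.encode("utf-8")
    # Step 2: those same bytes are interpreted with the wrong encoding.
    return utf8_bytes.decode("cp1252")


if __name__ == "__main__":
    for sample in ["café", "it’s", "2005 – 2008"]:
        print(repr(sample), "->", repr(garble(sample)))
```

Each accented letter or curly punctuation mark takes more than one byte in UTF-8, so reading it with a one-byte-per-character encoding turns a single character into two or three odd ones.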
Here’s what was wrong: old posts and comments that had text copied from another source, or that contained accented characters, were displaying incorrectly. So I made myself a conversion guide and set about using “Find and Replace” in WordPad to clean everything up. This took several hours. I then made a new WordPress install, used a plugin to convert my database to UTF-8 (from the default of Latin-1, for some reason), imported the XML file, restored my uploaded images, and checked the posts for remaining errors.
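The find-and-replace step could probably be scripted instead of done by hand. The sketch below is only an illustration: the entries in `CONVERSION_GUIDE` are typical garbled sequences, not my actual guide, and the `clean` function name is hypothetical.

```python
# Hypothetical conversion guide: garbled sequence -> intended character.
# These entries are common examples, not an exhaustive or authoritative list.
CONVERSION_GUIDE = {
    "â€™": "’",       # right single quote / apostrophe
    "â€œ": "“",       # left double quote
    "â€“": "–",       # en dash
    "Ã©": "é",
    "Ã¨": "è",
    "Â\u00a0": " ",   # garbled non-breaking space -> plain space
}


def clean(text, guide=CONVERSION_GUIDE):
    """Apply every replacement in the guide, longest sequences first."""
    # Longest first, so a short sequence never clobbers part of a longer one.
    for bad in sorted(guide, key=len, reverse=True):
        text = text.replace(bad, guide[bad])
    return text


if __name__ == "__main__":
    print(clean("cafÃ© â€“ itâ€™s"))
```

A table like this only fixes the sequences it lists, so a pass over the posts for leftovers is still needed afterwards.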
I really only got into trouble because I tried to use the XML file to restore everything, and it’s crap. So there shouldn’t be issues unless I have to move the site again, which is possible but – hopefully – unlikely. Here are some tips for writing comments and posts that will keep the number of odd characters down in the event of a future move:
- Only write in the browser window, using UTF-8 encoding.
- Do not copy and paste directly from another source without first passing the text through Notepad.
- Do not double-space between sentences (I had this habit from my typing class, but apparently it’s not good Internet etiquette).
- Of course, minimize the use of foreign-language characters.
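A couple of these tips can be checked automatically before posting. This is a rough sketch under my own assumptions: `check_text` is a made-up helper, and it treats any non-ASCII character as a stand-in for both pasted curly punctuation and foreign-language characters, which is cruder than the tips themselves.

```python
import re


def check_text(text):
    """Return a list of warnings for text likely to garble in a future move."""
    warnings = []
    # Two or more spaces after sentence-ending punctuation.
    if re.search(r"[.!?] {2,}", text):
        warnings.append("double space between sentences")
    # Anything outside plain ASCII (accents, curly quotes, dashes).
    non_ascii = sorted({ch for ch in text if ord(ch) > 127})
    if non_ascii:
        warnings.append("non-ASCII characters: " + " ".join(non_ascii))
    return warnings


if __name__ == "__main__":
    for line in check_text("I went to the café.  It was closed."):
        print(line)
```

An empty list means the text passed both checks; it says nothing about whether the browser itself is set to UTF-8.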