
Board index » About this forum » Technical discussion




 Post subject: The World of Â
 Post Posted: Sat Mar 06, 2010 12:54 am 
Jedi Master

Joined: Sun Dec 03, 2006 11:58 am
Posts: 3385
Location: Far from Japan
Yo. I've been rereading old topics over the last few days and I noticed that, for some reason, many posts, probably predating the second-to-last update, have been butchered with those invasive Â and perhaps some other mystical symbols.
Isn't there a script somewhere on the Internet to clean up all posts automatically, something that would correct or remove those Â?
I suppose they can be safely removed, since I don't think any of us has used Â even once in our posts.


 Post subject: Re: The World of Â
 Post Posted: Tue May 04, 2010 6:57 pm 
Site Admin

Joined: Mon Aug 14, 2006 8:26 pm
Posts: 1865
What happened is that the character conversion, when we went from PHPBB2 to PHPBB3, butchered some "special" characters and the stuff surrounding them. I tried to fix as much of it as possible and tried a few different conversion techniques, but regardless of which "fix" I used, it seemed like some of the special characters the old board software supported got converted incorrectly.
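For anyone wondering where a stray Â can come from at all, here is a minimal sketch of the most common cause of this kind of damage. It assumes (this is a guess, not something confirmed for this board) that the old text was stored as UTF-8 bytes and that the conversion read those bytes as Windows-1252. The function name `misdecode` is made up for the illustration.

```python
# Assumption: UTF-8 bytes were read as Windows-1252 (cp1252) during conversion.
# This is the usual mechanism behind stray "Â", not a confirmed diagnosis here.

def misdecode(text: str) -> str:
    """Simulate reading the UTF-8 bytes of `text` as if they were cp1252."""
    return text.encode("utf-8").decode("cp1252")

# "\u00a0" is a non-breaking space; it turns into "Â" plus an invisible space.
for ch in ("²", "³", "é", "\u00a0"):
    print(repr(ch), "->", repr(misdecode(ch)))

# '²' -> 'Â²'
# '³' -> 'Â³'
# 'é' -> 'Ã©'
# '\xa0' -> 'Â\xa0'
```

Under that assumption, every character outside plain ASCII grows into two or more characters, which would also explain why the text right next to a "special" character looks damaged.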

In some of the older attempts at conversion, entire paragraphs were replaced by strings of special characters; what you see is the least damaging conversion I could find. I tried to spot and replace anything missing, but there really are too many of them for it to be worth the time.

If there's a specific post that you'd like to clean or restore, especially if you think some material has been lost, indicate it in this thread and I'll know what to look for in the old database backups.

If those characters are cropping up in posts made after the major 2=>3 update (that is, when the skins changed), that's a sign that the database is becoming corrupted, which would be a bad thing.
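A rough way to check for that would be to scan only the posts made after the update for the typical damage signature. The sketch below is hypothetical throughout: the `(id, date, text)` tuples, the cutoff date, and the function name are stand-ins, not the real phpBB schema or the real upgrade date, and the pattern is a heuristic that can produce false positives.

```python
import re
from datetime import datetime

# Heuristic signature: "Â" or "Ã" followed by a character in U+0080..U+00BF,
# or the "â€" prefix that mis-decoded curly quotes and dashes start with.
MOJIBAKE = re.compile("[ÂÃ][\u0080-\u00bf]|â€")

def suspicious_posts(posts, cutoff):
    """Return ids of posts dated on or after `cutoff` that match the signature."""
    return [pid for pid, posted_at, text in posts
            if posted_at >= cutoff and MOJIBAKE.search(text)]

# Made-up sample data; the cutoff is a placeholder, not the actual upgrade date.
posts = [
    (1, datetime(2008, 1, 5), "10 mÂ² of hull"),  # damaged, but before the cutoff
    (2, datetime(2010, 5, 4), "plain text"),      # after the cutoff, clean
    (3, datetime(2010, 5, 8), "cafÃ© talk"),      # after the cutoff, damaged
]
cutoff = datetime(2009, 1, 1)

print(suspicious_posts(posts, cutoff))  # [3]
```

An empty result on recent posts would suggest the damage is confined to the conversion itself.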


 Post subject: Re: The World of Â
 Post Posted: Tue May 04, 2010 9:25 pm 
Jedi Master

Joined: Sun Dec 03, 2006 11:58 am
Posts: 3385
Location: Far from Japan
Bit late on that, s'rry.
Affected posts are only the old ones. It's very weird though, aside from ² or ³ I rarely used anything else, not even a good umlaut.

However it seems some " have been hit hard. It's quite a weird bug. I didn't know that all the text would be reinterpreted during an update. I thought text was stored in sort of immutable blocks and only the interface and functions would change.


 Post subject: Re: The World of Â
 Post Posted: Sat May 08, 2010 9:44 pm 
Jedi Master

Joined: Sun Dec 03, 2006 11:58 am
Posts: 3385
Location: Far from Japan
I have an example here:

http://www.starfleetjedi.net/forum/view ... 181#p14181

Here is the list of letters and symbols that haven't made it through (RIP):

é
'
-

The bugs might have occurred when they were found next to other letters.
It's really weird that those didn't get recognized during the update. Not all posts seem to have been affected, btw. If you can't even trust a template update, I can only begin to imagine the nightmare it is to administrate a huge website.

Here's another one:
http://www.starfleetjedi.net/forum/view ... 859#p14859

I don't know what WILGA used here, perhaps "

Something similar here:

http://www.starfleetjedi.net/forum/view ... 858#p14858

There may be a randomness in how often the bug appeared, but when it did, it followed a pattern, as each group of weird letters can be deciphered back to what they were, and it seems to always be the same result for each single group.
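That consistency is what you would expect if the damage is a plain encoding mix-up, and it means each group can in principle be decoded back mechanically. A minimal sketch, assuming the groups are UTF-8 bytes that were read as Windows-1252 (an assumption, not a confirmed diagnosis); `repair` is a made-up name:

```python
def repair(text: str) -> str:
    """Undo one round of UTF-8-read-as-cp1252 damage, if the text fits that pattern.

    The round trip is all-or-nothing: if any part of the text does not fit
    (for example an undamaged "é" elsewhere in the post), the text is
    returned unchanged.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text  # not this kind of damage; leave it alone

print(repair("10 mÂ²"))  # 10 m²
print(repair("cafÃ©"))   # café
print(repair("itâ€™s"))  # it’s
print(repair("café"))    # café (already fine, left untouched)
```

Because a given group of weird letters always corresponds to the same bytes, the same group always decodes to the same original character, which matches the pattern described above.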


Last edited by Jedi Master Spock on Sun May 09, 2010 1:34 am, edited 1 time in total.
Removed SIDs from URLs


 Post subject: Re: The World of Â
 Post Posted: Sun May 09, 2010 1:45 am 
Site Admin

Joined: Mon Aug 14, 2006 8:26 pm
Posts: 1865
(I've removed the SIDs from the URLs you posted. You generally don't want to post URLs including SIDs.)

One common element to many affected posts seemed to be that they were written in some other text editor, and then copy-pasted back onto the board. (One of the major offenders was a quotation symbol.) In some cases, the "normal" corresponding symbols made it through intact.
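This fits how word processors behave: they tend to swap straight quotes, apostrophes and hyphens for "curly" or typographic versions, which are not plain ASCII. The sketch below compares the two kinds under the assumption that the conversion read UTF-8 bytes as Windows-1252; the function name and the sample list are illustrative only.

```python
def misdecode(text: str) -> str:
    """Simulate reading the UTF-8 bytes of `text` as if they were cp1252."""
    return text.encode("utf-8").decode("cp1252")

samples = {
    "straight double quote": '"',
    "straight apostrophe": "'",
    "hyphen": "-",
    "curly opening quote": "\u201c",
    "curly apostrophe": "\u2019",
    "en dash": "\u2013",
}

for name, ch in samples.items():
    print(f"{name:24} {ch!r:6} -> {misdecode(ch)!r}")
```

The three ASCII symbols come out unchanged, while each typographic one turns into a three-character group starting with "â€". That would explain why the "normal" symbols survived in some posts and not in others.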

Doing a find-and-replace on the database can be a little bit of a pain, but it's doable if the symbol sequences don't overlap each other too much. IIRC, I did it for the ones I'd noticed back at the time of the conversion.
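The overlap problem can be handled by replacing the longest sequences first, so a short sequence never eats the start of a longer one. Here is a sketch of that idea on plain strings; the replacement table is hypothetical (it is not the list actually used during the conversion), and running it against the real database is left out on purpose.

```python
# Hypothetical table: damaged sequence -> intended character.
REPLACEMENTS = {
    "â€™": "\u2019",  # curly apostrophe
    "â€œ": "\u201c",  # curly opening quote
    "â€": '"',        # hypothetical: a quote whose last character was lost
    "Ã©": "é",
    "Â²": "²",
    "Â³": "³",
    "Â": "",          # stray leftover; only safe if nobody typed a real Â
}

def clean(text: str, table=REPLACEMENTS) -> str:
    """Apply the table longest-sequence-first so overlapping entries stay safe."""
    for bad in sorted(table, key=len, reverse=True):
        text = text.replace(bad, table[bad])
    return text

print(clean("itâ€™s 10 mÂ² of cafÃ©"))  # it’s 10 m² of café
```

If "â€" were applied before "â€™", the apostrophe group would be turned into a straight quote followed by a leftover "™", which is exactly the kind of overlap that makes a naive find-and-replace painful. It is worth working on a copy or a backup, since a blanket rule like the last one cannot be undone afterwards.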


 Post subject: Re: The World of Â
 Post Posted: Sun May 09, 2010 3:10 am 
Jedi Master

Joined: Sun Dec 03, 2006 11:58 am
Posts: 3385
Location: Far from Japan
So an outside text editor does attach some invisible data to the text.
I thought the phpbb system automatically filtered all bizarre symbols and letters.
If there's no automatic process to clean up all the database's posts, don't bother. It doesn't make the posts unreadable.

