What's so special about ø and è? Absolutely nothing until you try to work with them. A large speed hump from a few weeks back was internationalization, or i18n to its friends. Why i18n? Because there are 18 characters between the i and the n. But I digress.
For reasons quite unknown to us, Puakma is very popular in Europe. With our other products in the past, the US was the clear winner in terms of customers. i18n was something we skimmed over, as Australia has no "special" characters: ASCII text works absolutely fine for us, and for most of the world the first 128 characters (codes 0 to 127) are the same. It's only when you go above 127 into the special zone that things get very confusing.
For the most part Java does a brilliant job of keeping everything in its correct character set. Data is always stored as Unicode inside strings. But as we manipulate that data, sometimes it needs to be put into byte arrays or byte streams, like when we send data to the client, receive data from the client, or serialize some object to a disk file (like exporting a web application to a .pmx file). Stepping from the safety of Unicode to the wild frontier of bytes is very, very dangerous. Streams must be read with one character set, manipulated in Unicode, and written in another character set. One false move and your ø becomes a ?.
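To make that read, manipulate, write cycle concrete, here is a minimal sketch in plain Java. It is not Puakma's actual code: the class name `CharsetRoundTrip` and the helpers `transcode` and `hex` are made up for illustration, and the only library calls are the standard `String(byte[], Charset)` constructor and `String.getBytes(Charset)`. It takes the single ISO-8859-1 byte for ø, writes it out as UTF-8, then writes it out as US-ASCII to show where the ? comes from.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Puakma.
public class CharsetRoundTrip {

    // Read bytes in one character set, hold them as a Unicode String,
    // then write them out in another character set.
    static byte[] transcode(byte[] input, Charset readAs, Charset writeAs) {
        String text = new String(input, readAs); // bytes -> Unicode
        return text.getBytes(writeAs);           // Unicode -> bytes
    }

    // Render bytes as space-separated hex so the console encoding cannot fool us.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xF8}; // "ø" as stored in ISO-8859-1

        // Correct: read as ISO-8859-1, write as UTF-8.
        byte[] utf8 = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(hex(utf8)); // C3 B8

        // The false move: US-ASCII has no ø, so getBytes substitutes '?' (0x3F).
        byte[] ascii = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.US_ASCII);
        System.out.println(hex(ascii)); // 3F

        // Another false move: reading UTF-8 bytes as ISO-8859-1 turns one
        // character into two.
        String garbled = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length()); // 2
    }
}
```

The point of always passing an explicit `Charset` is that the no-argument forms of these calls fall back to the platform default, which may differ between the machine that wrote the bytes and the one that reads them.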
Western languages are now handled correctly, but we have no facilities for testing more extreme character sets, like Chinese, Hebrew, Arabic, etc. Like everything, I know this will come up in the future, and we'll cross that bridge when we get there. For now we probably have more than 85% of our potential customer base covered, and that will do while we round out the IDE.
Dear IDE, my, how there's so much riding on you...