I need to draw a barcode (QR) with Delphi 6/7. The program can run in various windows locales, and the data is from an input box.
On this input box, the user can choose a charset, and input his own language. This works fine. The input data is only ever from the same codepage. Example configurations could be:
- Windows is on Western Europe, Codepage 1252 for ANSI text
- Input is done in Shift-JIS ANSI charset
I need to get the Shift-JIS across to the barcode. The most robust way is to use hex encoding.
So my question is: how do I go from Shift-JIS to a hex String in UTF-8 encoding, if the codepage is not the same as the Windows locale?
As example: I have the string 能ラ
. This needs to be converted to E883BDE383A9
as per UTF-8. I have tried this but the result is different and meaningless:
String2Hex(UTF8Encode(ftext))
Unfortunately I can't just have an inputbox for WideStrings. But if I can find a way to convert the ANSI text to a WideString, the barcode module can work with Unicode Strings as well.
If it's relevant: I am using the TEC-IT TBarcode DLL.
Creating and accessing a Unicode text control
This is easier than you may think and I did so in the past with the brand new Windows 2000 when convenient components like Tnt Delphi Unicode Controls were not available. Having background knowledge on how to create a Windows GUI program without using Delphi's VCL and manually creating everything helps - otherwise this is also an introduction of it.
First add a property to your form, so we can later access the new control easily:
Now just create it at your favorite event - I chose
FormCreate
:At some point you want to get the edit's content. Again: how is this done without Delphi's VCL but instead directly with the WinAPI? This time I used a button's
Click
event:There are detail issues, like not being able to reach this new control via Tab, but we're already basically re-inventing Delphi's VCL, so those are details to take care about at other times.
Converting codepages
The WinAPI deals either in codepages (Strings) or in UTF-16 LE (WideStrings). For historical reasons (UCS-2 and later) UTF-16 LE fits everything, so this is always the implied target to achieve when coming from codepages:
The source codepage is up to you: maybe
1252
for "Windows-1252" = ANSI Latin 1 Multilingual (Western Europe)932
for "Shift-JIS X-0208" = IBM-PC Japan MIX (DOS/V) (DBCS) (897 + 301)28595
for "ISO 8859-5" = Cyrillic65001
for "UTF-8"However, if you want to convert from one codepage to another, and both source and target shall not be UTF-16 LE, then you must go forth and back:
As per every Windows installation not every codepage is supported, or different codepages are supported, so conversion attempts may fail. It would be more robust to aim for a Unicode program right away, as that is what every Windows installation definitly supports (unless you still deal with Windows 95, Windows 98 or Windows ME).
Combining everything
Now you got everything you need to put it together:
Size
UTF-8 is mostly the best choice, but size wise UTF-16 may need fewer bytes in total when your target audience is Asian: in UTF-8 both
能
andラ
need 3 bytes each, but in UTF-16 both only need 2 bytes each. As per your QR barcode size is an important factor, I guess.Likewise don't waste by turning binary data (8 bits per byte) into ASCII text (displaying 4 bits per character, but itself needing 1 byte = 8 bits again). Have a look at Base64 which encodes 6 bits into every byte. A concept that you encountered countless times in your life already, because it's used for email attachments.