I've read the following thread and I was able to make a conversion script (based on C#) that converts all my charset=NONE databases to charset=UTF8 and most of it works great (I still have a few special cases where characters are converted to weird symbols, but that's marginal).
My issue is that I have lots of backup database files (*.fbk) for which I don't know for sure whether they are UTF8 or NONE. In an ideal world, my code would restore the database from the file, detect its character set, and then convert only when necessary.
Is this at all possible? Or is there a way to define the charset when restoring the database (either via gbak or via the ADO.NET provider)?
In general, a Firebird database does not have a single character set. Each and every column can have its own character set. So the only thing you can do is try to use heuristics.
Use the database default character set. To be clear, the database default character set is only used when creating a new column when no explicit character set is specified. It is entirely possible for a database to have default character set UTF8, while all columns have character set WIN1251!
You can find the database default character set with the following query:
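A minimal sketch of such a query, assuming the default is read from the `RDB$CHARACTER_SET_NAME` column of the single-row system table `RDB$DATABASE`:

```sql
-- Returns one row: the name of the database default character set,
-- or NULL when no default was specified at creation time.
select RDB$CHARACTER_SET_NAME
from RDB$DATABASE
```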
NOTE: If the result is NULL, then that means the default character set is NONE.
Count the different character sets of CHAR, VARCHAR and BLOB SUB_TYPE TEXT columns to see which occurs most:
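A sketch of such a count, assuming the standard Firebird system tables and the field type codes 14 (CHAR), 37 (VARCHAR) and 261 (BLOB). It skips system tables and views, and reports a missing character set as NONE:

```sql
select
  coalesce(cs.RDB$CHARACTER_SET_NAME, 'NONE') as CHARSET,
  count(*) as CHARSET_COUNT
from RDB$RELATIONS r
inner join RDB$RELATION_FIELDS rf
  on rf.RDB$RELATION_NAME = r.RDB$RELATION_NAME
inner join RDB$FIELDS f
  on f.RDB$FIELD_NAME = rf.RDB$FIELD_SOURCE
left join RDB$CHARACTER_SETS cs
  on cs.RDB$CHARACTER_SET_ID = f.RDB$CHARACTER_SET_ID
where coalesce(r.RDB$SYSTEM_FLAG, 0) = 0   -- user tables only
  and r.RDB$VIEW_BLR is null               -- exclude views
  and (f.RDB$FIELD_TYPE in (14, 37)        -- CHAR, VARCHAR
       or (f.RDB$FIELD_TYPE = 261          -- BLOB ...
           and f.RDB$FIELD_SUB_TYPE = 1))  -- ... SUB_TYPE TEXT
group by 1
order by 2 desc
```

The first row of the result is the character set used by the most columns, which is a reasonable guess for the character set of the database as a whole.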
As an aside, be aware that if clients have used connection character set NONE, then it is entirely possible that the actual character set of the contents of a column does not match the defined character set of that column.