[Info-ingres] replacing special characters eg ä in SQL with a unicode script u'ä'
Chris Clark
chris.clark at actian.com
Thu Apr 14 15:46:47 UTC 2016
On Thursday, April 14, 2016 at 7:24:03 AM UTC-7, Allan Biggs wrote:
> Is there a way of replacing characters
> such as ä
> (a umlaut) as part of an
> SQL query.
>
> We are sometimes fed these characters
> by another system,
>
> in my case I am producing a python script
> which for ä needs to
> be represented by the string u'ä'
> or substituted for a none unicode character.
Ingres will translit between varchar and nvarchar. So Ingres considers (using your syntax) 'ä' and u'ä' as comparable (assuming the character is encoded correctly in the declared II_CHARSETxx encoding).
Depending on which version of Python 2.x or 3.x series they may or may not compare the same way (depending on locale, encoding, etc.).
>
> clearly I can do something like
>
> update filename_w_asd set file_name
> = replace(file_name,'ä','''ä''');
>
> \p\g
>
> but is there a more generic way to do
> this?
>
> Allan
I'm not sure you need to do this at all. Unless you are trying to convert something into 7-bit clean US-ASCII, i.e. 'ä' into 'a' (plain a). There isn't a good way to do this in Ingres (you could use a NFC database and strip all non-ascii characters but again, not a good way). Doing this as part of ETL (e.g. with a Python script) is probably your best bet here.
If you just want unicode strings in Python, then select using an nvarchar cast. E.g.
# Assuming jyjdbc or (py)pyodbc driver using Ingres JDBC or ODBC driver
c.execute("SELECT nvarchar('ä')") # NOTE ensure python encoding set correctly
c.execute("SELECT nvarchar(?)", ('ä',)) # NOTE ensure python encoding set correctly
More information about the Info-ingres
mailing list