UTF8/UTF16 and NVARCHAR2 string comparison failure

Discussion of open issues, suggestions and bugs regarding ADO.NET provider for Oracle
Post Reply
object
Posts: 91
Joined: Tue 26 Oct 2010 08:29

UTF8/UTF16 and NVARCHAR2 string comparison failure

Post by object » Mon 07 Nov 2011 11:58

We have been using our application with Oracle database using NLS_NCHAR_CHARACTERSET set to AL16UTF16, and everything worked fine.

Now we have connected it to a database that uses UTF8 and we are experiencing strage behavior.

A simple LINQ query that contains join with string comparison suddently returns an empty result set. Here is one:

--> FAIL: returns emtpy result set

Code: Select all

            string mediumtype = "tv";

            var currentSeries = from s in _entities.Series

                                join tm in _entities.SeriesTitles on s.SerieId equals tm.SerieId into tmj
                                from tm in tmj.Where(m =>
                                        m.TitlesGroupType == TitlesGroupType.MainTitle)
                                where s.Medium.ToLower() == mediumtype.ToLower()

                                select s;
But if change mediumtype to a hardcoded string, it succeeds!

--> SUCCEDS

Code: Select all

            var currentSeries = from s in _entities.Series

                                join tm in _entities.SeriesTitles on s.SerieId equals tm.SerieId into tmj
                                from tm in tmj.Where(m =>
                                        m.TitlesGroupType == TitlesGroupType.MainTitle)
                                where s.Medium.ToLower() == "tv"

                                select s;
The string data types are all NVARCHAR2, and here is the generated queries:

SELECT "GroupBy1".A1 AS C1
FROM (SELECT Count(1) AS A1
FROM SERIES "Extent1"
INNER JOIN SERIES_TITLES "Extent2"
ON ("Extent1".SERIE_ID = "Extent2".SERIE_ID)
AND ("Extent2".TITLES_GROUP_TYPE = 'mainTitle' /* @p__linq__0 */)
WHERE (LOWER("Extent1".MEDIUM)) = (LOWER('tv' /* @p__linq__1 */))) "GroupBy1"

SELECT "GroupBy1".A1 AS C1
FROM (SELECT Count(1) AS A1
FROM SERIES "Extent1"
INNER JOIN SERIES_TITLES "Extent2"
ON ("Extent1".SERIE_ID = "Extent2".SERIE_ID)
AND ("Extent2".TITLES_GROUP_TYPE = 'mainTitle' /* @p__linq__0 */)
WHERE (LOWER("Extent1".MEDIUM)) = 'tv') "GroupBy1"

Both queries work fine on a UTF16 database.

object
Posts: 91
Joined: Tue 26 Oct 2010 08:29

Post by object » Mon 07 Nov 2011 13:46

I did more tests and narrowed the problem: strange things happen when a query contains a string literal, e.g.

s == MyClass.SomeStringConstant

If I add ToLower() to both parts of the comparison, everything works as it should.

This only happens when Oracle database uses UTF8 for N(VAR)CHAR types.

Shalex
Site Admin
Posts: 9543
Joined: Thu 14 Aug 2008 12:44

Post by Shalex » Mon 07 Nov 2011 16:59

Please turn on dbMonitor and tell us the data type of the p__linq__1 parameter.

object
Posts: 91
Joined: Tue 26 Oct 2010 08:29

Post by object » Tue 08 Nov 2011 07:23

I've added to a test the following code:

var dbMonitor = new OracleMonitor();
dbMonitor.IsActive = true;

However, the only event I see in DB Monitor is "dotConnect for Oracle monitoring is started". I am running DB Monitor with default filter, i.e. it should display all events. The test code is running from within a unit test, I tried different unit test runners with no difference.

object
Posts: 91
Joined: Tue 26 Oct 2010 08:29

Post by object » Tue 08 Nov 2011 07:36

Oops, I incorrectly set up dbMonitor first, now it's working, and here's what I get:

1. When query fails (empty result set):
string mediumtype = "tv";

var currentSeries = from s in _entities.Series
join tm in _entities.SeriesTitles on s.SerieId equals tm.SerieId into tmj
from tm in tmj
.Where(m => m.TitlesGroupType == TitlesGroupType.MainTitle)
where s.Medium.ToLower() == mediumtype.ToLower()
select s;
SELECT
"GroupBy1".A1 AS C1
FROM ( SELECT Count(1) AS A1
FROM SERIES "Extent1"
INNER JOIN SERIES_TITLES "Extent2" ON ("Extent1".SERIE_ID = "Extent2".SERIE_ID) AND ("Extent2".TITLES_GROUP_TYPE = :p__linq__0)
WHERE (LOWER("Extent1".MEDIUM)) = (LOWER(:p__linq__1))
) "GroupBy1"

p__linq__0: NVarChar, mainTitle
p__linq__1: VarChar, tv

2. When query succeeds (non-empty result set):
var currentSeries = from s in _entities.Series
join tm in _entities.SeriesTitles on s.SerieId equals tm.SerieId into tmj
from tm in tmj
.Where(m => m.TitlesGroupType == TitlesGroupType.MainTitle)
where s.Medium.ToLower() == "tv"
select s;
SELECT
"GroupBy1".A1 AS C1
FROM ( SELECT Count(1) AS A1
FROM SERIES "Extent1"
INNER JOIN SERIES_TITLES "Extent2" ON ("Extent1".SERIE_ID = "Extent2".SERIE_ID) AND ("Extent2".TITLES_GROUP_TYPE = :p__linq__0)
WHERE (LOWER("Extent1".MEDIUM)) = 'tv'
) "GroupBy1"

p__linq__0: NVarChar, mainTitle

3. Another variant of the same query that succeds:
string mediumtype = "tv";

var currentSeries = from s in _entities.Series
join tm in _entities.SeriesTitles on s.SerieId equals tm.SerieId into tmj
from tm in tmj
.Where(m => m.TitlesGroupType.ToLower() == TitlesGroupType.MainTitle.ToLower())
where s.Medium.ToLower() == mediumtype.ToLower()
select s;
SELECT
"GroupBy1".A1 AS C1
FROM ( SELECT Count(1) AS A1
FROM SERIES "Extent1"
INNER JOIN SERIES_TITLES "Extent2" ON ("Extent1".SERIE_ID = "Extent2".SERIE_ID) AND ((LOWER("Extent2".TITLES_GROUP_TYPE)) = (LOWER(:p__linq__0)))
WHERE (LOWER("Extent1".MEDIUM)) = (LOWER(:p__linq__1))
) "GroupBy1"

p__linq__0: VarChar, mainTitle
p__linq__1: VarChar, tv

Note the difference in representation of p__linq__0 in variants 1 and 3! Looks like when VarChar and NVarChar parameters are mixed, query fails.

Shalex
Site Admin
Posts: 9543
Joined: Thu 14 Aug 2008 12:44

Post by Shalex » Wed 09 Nov 2011 11:37

1. Please send us a small test project with the corresponding DDL/DML script to reproduce the issue in our environment.
2. Specify:
a) the version and NLS_NCHAR_CHARACTERSET, NLS_CHARACTERSET, NLS_LANGUAGE parameters of your Oracle server;
b) the HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\%HomeName% > NLS_LANG setting of your Oracle client;
c) the exact (x.xx.xxx) version of your dotConnect for Oracle;
d) your connection string (roughly, without credentials). Have you tried using the "Unicode=true;" connection string parameter?

object
Posts: 91
Joined: Tue 26 Oct 2010 08:29

Post by object » Wed 09 Nov 2011 15:19

I have sent required information (schema, test code and environment data).

Looking forward to hear from you soon.

object
Posts: 91
Joined: Tue 26 Oct 2010 08:29

Post by object » Thu 10 Nov 2011 08:53

In the mail that I sent yesterday I didn't submit and test data, but it's simple to create: just insert some rows in SERIES table with MEDIUM set to 'tv' and 'radio', and some SERIES_TITLES rows with TITLES_GROUP_TYPE set to 'mainTitle' and 'secondaryTitle'. I think this should be sufficient.

object
Posts: 91
Joined: Tue 26 Oct 2010 08:29

Post by object » Mon 14 Nov 2011 09:41

I have donwloaded dotConnect 6.50.244 and these UTF8/UTF16 issues are gone. Everything works fine.

Thanks for the update.

Post Reply