UTF sorting in sqlite databases

Discussion of open issues, suggestions and bugs regarding UniDAC (Universal Data Access Components) for Delphi, C++Builder, Lazarus (and FPC)
sandy771
Posts: 194
Joined: Tue 22 May 2007 13:57

UTF sorting in sqlite databases

Post by sandy771 » Mon 14 Jun 2010 22:52

Is this supported in the unidac product?

Edit to add that I have stCaseInsensitive set in TuniConnection but my tables are still sorted as case sensitive.

The sqlite faq (http://www.sqlite.org/faq.html#q18) does say that in order to provide unicode sorting external routines would need to be added. As I am working through the devart routines I assume I need to rely on Devart to implement this for me?

If there is a way of achieving case senstive sorting (pretty essential for my app) then could you please supply an example.
Last edited by sandy771 on Tue 15 Jun 2010 15:26, edited 1 time in total.

bork
Devart Team
Posts: 649
Joined: Fri 12 Mar 2010 07:55

Post by bork » Tue 15 Jun 2010 15:20

Hello

When you write query like the following:

Code: Select all

  select *
    from my_table
    order by name
Then sorting is executed on the SQLite server (usually it is sqlite3.dll) and UniDAC gets already sorted data. As I know SQLite has some problems with operators LIKE and ORDER BY when text contains chars in the UTF encoding. But unfortunately we cannot change behavior of SQLite. If you don't like behavior of ORDER BY operator then you should ask SQLite developers to fix it or you can try to find some SQLite extension that fixes this problem.

sandy771
Posts: 194
Joined: Tue 22 May 2007 13:57

Post by sandy771 » Tue 15 Jun 2010 15:26

Hi Bork

Thanks for the answer.

If you read the faq linked to in my previous email it seems that YOU can implement sorting by overriding the sqlite inbuilt collation routines - as far as I can see I cant do this.

The sqlite devlepers have added a route for you to do this as part of the unidac controls, but becuase I am accessing via your controls this route seems to be denied to me.

Could I please ask that you look at this and see about implementing a fix.

The default configuration of SQLite only supports case-insensitive comparisons of ASCII characters. The reason for this is that doing full Unicode case-insensitive comparisons and case conversions requires tables and logic that would nearly double the size of the SQLite library. The SQLite developers reason that any application that needs full Unicode case support probably already has the necessary tables and functions and so SQLite should not take up space to duplicate this ability.

Instead of providing full Unicode case support by default, SQLite provides the ability to link against external Unicode comparison and conversion routines. The application can overload the built-in NOCASE collating sequence (using sqlite3_create_collation()) and the built-in like(), upper(), and lower() functions (using sqlite3_create_function()). The SQLite source code includes an "ICU" extension that does these overloads. Or, developers can write their own overloads based on their own Unicode-aware comparison routines already contained within their project.

sandy771
Posts: 194
Joined: Tue 22 May 2007 13:57

Post by sandy771 » Wed 16 Jun 2010 09:54

It seems to me that this is an issue that you can easily resolve and one that the developers of sqlite have left for you to resolve.

It does not seem to be appropriate for ME to have to ask the sqlite developers to fix something when they have clearly left it for you as the developer of the DAC.

Could you please confirm that you will be implementing a UTF sort option - if you won't then I need to be looking for a solution to this problem and I dont really want to have to spend my money on a different DAC package.

Please support your customers.

bork
Devart Team
Posts: 649
Joined: Fri 12 Mar 2010 07:55

Post by bork » Wed 16 Jun 2010 13:06

We have read this FAQ before and we have investigated the possibility to add this functionality to UniDAC. If you want to have this functionality inside UniDAC then we will try to add it in the next UniDAC build.

sandy771
Posts: 194
Joined: Tue 22 May 2007 13:57

Post by sandy771 » Wed 16 Jun 2010 13:38

bork wrote:We have read this FAQ before and we have investigated the possibility to add this functionality to UniDAC. If you want to have this functionality inside UniDAC then we will try to add it in the next UniDAC build.
Bork - Thank You very much - I look forward to it.

sandy771
Posts: 194
Joined: Tue 22 May 2007 13:57

Post by sandy771 » Tue 13 Jul 2010 12:10

Hi Bork

Not chasing but just wondered if you had an approximate timescale for the release of the unidac build including this feature

bork
Devart Team
Posts: 649
Joined: Fri 12 Mar 2010 07:55

Post by bork » Tue 13 Jul 2010 12:47

Hello

This feature was added and will be included in the next UniDAC build.

To register you own collation function you should add the LiteClassesUni unit to the USES section and call the RegisterCollation function.

sandy771
Posts: 194
Joined: Tue 22 May 2007 13:57

Post by sandy771 » Tue 13 Jul 2010 14:55

Sorry Bork

Are you saying that when the new version is released I will need to add my own collate function or that I can add the collate function to the current version?

when will the new version be out?

thank You

bork
Devart Team
Posts: 649
Joined: Fri 12 Mar 2010 07:55

Post by bork » Wed 14 Jul 2010 15:01

Hello

The next version of UniDAC will be released in 1-2 weeks.

sandy771
Posts: 194
Joined: Tue 22 May 2007 13:57

Post by sandy771 » Thu 15 Jul 2010 13:33

Excellent - thanks

sandy771
Posts: 194
Joined: Tue 22 May 2007 13:57

Post by sandy771 » Wed 18 Aug 2010 09:08

New build of UniDAC version 3.00.0.11 is available for download now.
This version includes:

* Added UTF sorting for the SQLite database
Thanks for this guys - now how do I implement it. Selecting foCaseInsenstive = true in the filter options makes no difference when sorting a column all lower case follow all the upper case

bork
Devart Team
Posts: 649
Joined: Fri 12 Mar 2010 07:55

Post by bork » Thu 19 Aug 2010 12:02

Hello

If you want to use UTF sorting for SQLite, you can use the following sample:

Declare callback function for the collation:

Code: Select all

function collation_callback(UserData: Pointer;
  l1: Integer; const s1: Pointer;
  l2: Integer; const s2: Pointer): Integer; cdecl;

  function GetString(Buffer: Pointer; Len: integer): string;
  begin
    SetLength(Result, Len);
    Move(Buffer^, Result[1], Len);
    Result := UTF8Decode(Result)
  end;

var
  wstr1, wstr2: WideString;
begin
  wstr1 := GetString(s1, l1);
  wstr2 := GetString(s2, l2);

  wstr1 := WideUpperCase(wstr1);
  wstr2 := WideUpperCase(wstr2);

  if wstr1 > wstr2 then
    Result := 1
  else if wstr1 < wstr2 then
    Result := -1
  else
    Result :=  0;
end;
Register callback function for the collation :

Code: Select all

RegisterCollation(TUniUtils.GetCRConnection(UniConnection1), 'SYSTEMNOCASE', SQLITE_UTF8, nil , collation_callback);
Use query with this collation:

Code: Select all

select *
  from BTEST
  order by name COLLATE SYSTEMNOCASE
To use this code you should add LiteClassesUni and LiteCallUni units to the USES section.

sandy771
Posts: 194
Joined: Tue 22 May 2007 13:57

Post by sandy771 » Mon 18 Oct 2010 18:52

I have been unavoidably away for a while and have not, until now, had time to try and implement this.

I have the code below. When I execute RegisterCollation I get an access violation read of address 00000000. Any ideas what I might be doing wrong?



RegisterCollation(TUniUtils::GetCRConnection(UniConnection1), "SYSTEMNOCASE", SQLITE_UTF8, NULL, collation_callback);

and

int __cdecl collation_callback(void* userdata, int l1, const void *s1, int l2, const void *s2)
{
WideString w1, w2;

w1 = WideUpperCase(UTF8Decode((char*)s1));
w2 = WideUpperCase(UTF8Decode((char*)s2));
if(w1 > w2)
return 1;
else
if( w1 < w2)
return -1;
else
return 0;
}

AlexP
Devart Team
Posts: 5530
Joined: Tue 10 Aug 2010 11:35

Post by AlexP » Wed 20 Oct 2010 08:14

Hello,

I tried execute your example, and it works correctly.
I tested it on C++Builder 2010 with UniDAC 3.50.0.12.
Please specify your version of IDE and UniDAC, or send me the example project to demonstrate the problem to alexp*devart*com.

Post Reply