Eating server memory

Discussion of open issues, suggestions and bugs regarding ADO.NET provider for PostgreSQL
Post Reply
janagn
Posts: 6
Joined: Thu 24 Nov 2011 11:34

Eating server memory

Post by janagn » Thu 24 Nov 2011 12:01

Hi all,

This is my first post here so I hope that the issue has not been discussed again. If so, please accept my apologises. Here is my problem:

I have a Postgres database with millions of rows and a prepared Command that I execute over and over again changing the parameters to retrieve specific rows. In fact this command is usually executed around 20 million times. Now I know that possibly I could reduce those iteration, but before I do this I need to make sure that what I observer is normal. Here is what I observe:

top - 11:46:05 up 85 days, 2:41, 1 user, load average: 0.67, 1.03, 1.19
Tasks: 89 total, 1 running, 88 sleeping, 0 stopped, 0 zombie
Cpu(s): 10.3%us, 0.7%sy, 0.0%ni, 69.8%id, 19.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8070900k total, 8017232k used, 53668k free, 1508k buffers
Swap: 7811068k total, 4283720k used, 3527348k free, 2088528k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15965 postgres 20 0 12.9g 6.4g 1.4g S 11 83.4 13:59.15 postgres
15961 postgres 20 0 4285m 1.8g 1.8g D 31 23.2 9:35.03 postgres
15827 postgres 20 0 4276m 52m 51m S 0 0.7 0:00.50 postgres
15830 postgres 20 0 4278m 25m 24m S 0 0.3 0:01.27 postgres
15959 postgres 20 0 4288m 4376 4288 S 0 0.1 0:00.25 postgres
15832 postgres 20 0 4279m 1388 888 S 0 0.0 0:00.19 postgres

This is an extract from top and below is an extract of iostat:

Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 158.00 1802.00 1602.00 3604 3204
sdb 70.50 590.00 40.00 1180 80
sdc 16.00 528.00 0.00 1056 0

So, after 2 hours of run, the postgres connection process (15965) has consumed all the server's memory and has start swapping (sda is my swap hd). This has as a result the degradation of my retrieval time. When I started it was 0.001sec per execution and now it has gone up to 0.01.

My question is why my process has consumed so much memory since I have a prepared command that I simply execute and normally close the reader after all. Are these cached data that are accumulating in the memory accessed by the process? And if yes, how can I clear them as I will obviously not use them again.

I have to admit that I was using before the freeware npgsql ado.net "driver" and same behaviour was observed there as well. I jumped here hoping that this issue might go away. So I am not sure if this is a postgres "issue" or a ado.net flavour issue or an issue all togather! I would expect though somehow to be able to clear this "memory" by closing the connection and disposing it every so often. Would this be a correct course of action?

Thank you in advance
Yiannis

Shalex
Site Admin
Posts: 9543
Joined: Thu 14 Aug 2008 12:44

Post by Shalex » Mon 28 Nov 2011 15:16

janagn wrote:When I started it was 0.001sec per execution and now it has gone up to 0.01.
Try using Protocol 2.0 (the Protocol=2; connection string parameter, by default 3) - its performance is better for simple requests.

If the results with dotConnect for PostgreSQL and NPGSQL are the same, there are possible reasons:
- incorrect code
- memory is eaten by server side processing
- the size of consumed memory is OK (determined by program data).

You can send us a small test project with the corresponding DDL/DML script so that we can reproduce the issue in our environment.

janagn
Posts: 6
Joined: Thu 24 Nov 2011 11:34

No change

Post by janagn » Fri 02 Dec 2011 20:26

Hello and thank you for the answer. I tried to use protocol 2 but there was no improvement. I also simplified my code a lot trying to put together an example to send you in case you could shed some light. In the process of putting together the example I found out the following interesting point:
Say that you open a connection and you keep it open for throughout the iteration I was talking before. Even if you don't use it i.e. no queering anything, just by keeping it open, its memory consumption on the server keeps on growing. Not as much as when querying but for example after 1:25 hours the connection process has grown to 2.4% or 191m resident memory. Providing that the connection is idle I don't see why is this happening. Please check the email you provided, I have posted you the vb code that produces these results.

Any help will be much appreciated.
Kind regards
Yiannis

s_tristan
Posts: 9
Joined: Wed 06 Apr 2005 15:34

Post by s_tristan » Sat 03 Dec 2011 20:10

I'm also interesting for DevArt comments on this issue.

janagn
Posts: 6
Joined: Thu 24 Nov 2011 11:34

Post by janagn » Sat 03 Dec 2011 21:56

s_tristan wrote:I'm also interesting for DevArt comments on this issue.
Hi, as I said this issue appears on the freeware NPGSQL ado.net driver. That makes me feel that either this is a common bug or it is not related to the ado driver but to the postgres itself. However I have a small hint as well. The jdbc driver for java had a similar issue that was related to server notices about the running queries that had not been read or cleared. Is there any possibility that such messages are accumulated to the server side, thus increasing the memory consumption because nobody is clearing them? As I am running almost 22million queries within a single connection the issue is more apparent than executing a few of them and then closing the connection. Finally I have the feeling that if I turn my queries to cursors the issue will disappear. The question still remains thought.

Thank you again
Yiannis

janagn
Posts: 6
Joined: Thu 24 Nov 2011 11:34

Descision before we purchase

Post by janagn » Mon 05 Dec 2011 10:27

Hello all, this is one more post regarding this issue as I had a very interesting weekend. I hope you all understand that I am running the 30days trial version and the resolution of this issue will dominate the decision as to whether we proceed with the purchase of the product or not.

I am in a situation right now that I have a three way communication with Postgres support, Devart and Npgsql and by look of it, everybody assumes that it is the other's problem and quite possible they all think that it is my code issue. However here are the facts:

1. Single prepared statement/command if executed a lot of times (I mean a lot!!!) it eats server memory.
2. It is a single statement/command and NO I am not creating a new one every time. In my code examples the Command/Statement is declared STATIC/SHARED on the class and initialised if and only if the static is NULL/Nothing.
3. My example case is that I have 55000 rows and every row has an average 400 connected rows via the foreign key. Total: 22.000.000
3a. If (for some reason) I am forced to query 22.000.000 times, my system will run out of memory because executing the prepared Command so many times will do so.
3b. If (as I am doing in other cases) I query my system 55.000 times by altering the query to accept groups of 400 rows in an array parameter then the system IS NOT running out of memory BUT it is still apparent that it eats up memory. There is no logic or at least no reason that I can see that a prepared command would eat 8.8% of server's memory. What is going to happen if 50 people try to run the same creating 50 connections? (process 22103 is the connection)

top - 10:07:20 up 96 days, 1:02, 1 user, load average: 0.01, 0.02, 0.05
Tasks: 87 total, 1 running, 86 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.1%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8070900k total, 8020612k used, 50288k free, 56296k buffers
Swap: 7811068k total, 4336k used, 7806732k free, 7675232k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22103 postgres 20 0 2187m 701m 697m S 0 8.9 1:29.16 postgres

So what are the possible solutions or options here:
1. This is a postgres normal behaviour. I may not like it but maybe it is. I guess that the only way around it is to close and open connections to clear up memory
2. This is a problem on postgres and there is a memory leak of some kind.
3. This is a common bug both in Devart's code and in the Freeware NPGSQL code. As it is not very common to execute a select 22million times and you may by pass the issue by opening and closing a connection nobody notices it.
3a. The postgres people mentioned quite a few times the term "Do you dispose the command/statement object"? Not sure how am I suppose to do this in ado.net as calling dispose on the command object is not making things better. Besides I have ONLY one prepared statement. If I need to dispose it I will at the end of the iteration.
3b. In the Java World, there was a similar case where the CONNECTION object was accumulating "server notices" that you had to clear by your self manually from within the client's code. Is there a possibility that both Devart and NPGSQL simply does not provide this facility though their APIs thus the "server notices" are accumulated on the server waiting to be read?

I list below all my post to postgres mailling lists and to NPGSQL for reasons of completion. Any help and idea will be much appreciated:

Postgress Novice Thread

NPGSQL Forum

As I said any support or hint towards the right direction will be much appreciated.

Kind Regards
Yiannis

Shalex
Site Admin
Posts: 9543
Joined: Thu 14 Aug 2008 12:44

Post by Shalex » Tue 06 Dec 2011 14:08

1. As we understood, this is a PostgreSQL service (postgres.exe) which consumes a lot of memory when OutOfMemory occurs in your environment. Is this correct? Please open Windows Task Manager and check which process eats the memory - server's one or the one of your application.

2. We have slightly modified your application (it was sent to you by e-mail): only one global connection is used now, and the Dispose() method is called for PgSqlDataReader and PgSqlCommand. Try it in your environment.

janagn
Posts: 6
Joined: Thu 24 Nov 2011 11:34

Problem solved!!!

Post by janagn » Wed 07 Dec 2011 09:13

Hello all,

I think I have solved the problem. Many thanks to the devart people for their support and the time they spend. The solution was as follows:

1. There was one connection that as I described was used IN A LOOP 22million times. This connection was assigned a PID x (on the linux server)
2. Nested within this LOOP there was another connection that had been forgotten from past code and the linux server was assigning to it a PID y
3. PID y was of course called also 22million times (since it was in the loop). However it had a nasty bug and it was creating constantly prepared commands! (opps my mistake :oops: ). So PID y was creating 22million prepared commands!
4. As I had no clue that that there was at all PID y, monitoring the TOP on the server I was presented with the misbehaving PID y but I was of the impression that it was PID x. In fact PID x was below in the list happy doing its own job.

So the healthy PID X had a top signature as follows (please note the difference between RES and SHR as well as the magnitude in Mb):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30475 postgres 20 0 2187m 746m 741m S 31 9.5 0:41.48 postgres

While the unhealthy PID Y had a TOP signature (please note that RES memory is at 12.9g! and SHR 1.4g as well as the magnitude in Gb!):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15965 postgres 20 0 12.9g 6.4g 1.4g S 11 83.4 13:59.15 postgres

As I said I had no clue about the existence of PID Y and since it was coming top at the TOP list I had wrongfully assumed that it was the PID X.

I hope this explains everything.
Kind Regards and sorry for the misunderstanding
Yiannis

Post Reply