Mailing List Archive

db_get_main_header/follow-up on performance
db_get_main_header() retrieves all of the messageblks. Shouldn't there
be a LIMIT 1 in the query?

A few days ago I mentioned problems I was having with dbmail-imapd being
very slow with large messages. It turns out this was not so much a
dbmail problem but a problem with user-mode linux under which I am
running dbmail. I tracked the long delay to a call to
mysql_store_result() in db_query. With a large recordset this sometimes
takes a very long time. There seems to be an issue with networking
within uml.
I wonder if there might be other situations in which
mysql_store_result() is very slow and mysql_use_result() should be used
instead.
Any thoughts?

xn
Re: db_get_main_header/follow-up on performance [ In reply to ]
On Friday, Oct 10, 2003, at 22:23 Europe/Amsterdam, Christian G. Warden
wrote:

> db_get_main_header() retrieves all of the messageblks. Shouldn't there
> be a LIMIT 1 in the query?

LIMIT is not valid SQL-92 as far as I know, which would make it
difficult
or impossible to use the same code using another RDBMS (Oracle for
instance)
>
> A few days ago I mentioned problems I was having with dbmail-imapd
> being
> very slow with large messages. It turns out this was not so much a
> dbmail problem but a problem with user-mode linux under which I am
> running dbmail. I tracked the long delay to a call to
> mysql_store_result() in db_query. With a large recordset this
> sometimes
> takes a very long time. There seems to be an issue with networking
> within uml.
> I wonder if there might be other situations in which
> mysql_store_result() is very slow and mysql_use_result() should be used
> instead.
> Any thoughts?
I have no clue..

Ilja
--
IC&S
Koningsweg 4
3582 GE UTRECHT
Re: db_get_main_header/follow-up on performance [ In reply to ]
On Sun, Oct 12, 2003 at 10:45:04PM +0200, Ilja Booij wrote:
> On Friday, Oct 10, 2003, at 22:23 Europe/Amsterdam, Christian G. Warden
> wrote:
>
> >db_get_main_header() retrieves all of the messageblks. Shouldn't there
> >be a LIMIT 1 in the query?
>
> LIMIT is not valid SQL-92 as far as I know, which would make it
> difficult
> or impossible to use the same code using another RDBMS (Oracle for
> instance)

But it is getting too many rows, right? Perhaps a select_limit function
should be written for each sql driver. Or, even easier, why not add a
header flag to the messageblks table?

xn
Re: db_get_main_header/follow-up on performance [ In reply to ]
On Sunday, Oct 12, 2003, at 23:04 Europe/Amsterdam, Christian G. Warden
wrote:

> On Sun, Oct 12, 2003 at 10:45:04PM +0200, Ilja Booij wrote:
>> On Friday, Oct 10, 2003, at 22:23 Europe/Amsterdam, Christian G.
>> Warden
>> wrote:
>>
>>> db_get_main_header() retrieves all of the messageblks. Shouldn't
>>> there
>>> be a LIMIT 1 in the query?
>>
>> LIMIT is not valid SQL-92 as far as I know, which would make it
>> difficult
>> or impossible to use the same code using another RDBMS (Oracle for
>> instance)
>
> But it is getting too many rows, right?
yes it is.
> Perhaps a select_limit function
> should be written for each sql driver. Or, even easier, why not add a
> header flag to the messageblks table?
Hmm, this last suggestion is rather nice I think. (I like easy
solutions!)

ilja

--
IC&S
Koningsweg 4
3582 GE UTRECHT
Re: db_get_main_header/follow-up on performance [ In reply to ]
On Sun, 2003-10-12 at 17:08, Ilja Booij wrote:
> > But it is getting too many rows, right?
> yes it is.
> > Perhaps a select_limit function
> > should be written for each sql driver. Or, even easier, why not add a
> > header flag to the messageblks table?
> Hmm, this last suggestion is rather nice I think. (I like easy
> solutions!)

Shouldn't the headers be separated from the message body? It would
solve the problem above, and probably improve performance. There should
be a message_headers table that just holds the part of the message body
that contains the headers.

This would solve the problem of selecting too many rows, it would
probably be faster since the message headers are usually much smaller
than the size of a full message_blk, and it fixes the problem of the
message headers exceeding the size of a message_blk which currently
would cause problems.

Matthew
Re: db_get_main_header/follow-up on performance [ In reply to ]
On Sun, Oct 12, 2003 at 05:18:37PM -0400, Matthew T. O'Connor wrote:
> On Sun, 2003-10-12 at 17:08, Ilja Booij wrote:
> > > But it is getting too many rows, right?
> > yes it is.
> > > Perhaps a select_limit function
> > > should be written for each sql driver. Or, even easier, why not add a
> > > header flag to the messageblks table?
> > Hmm, this last suggestion is rather nice I think. (I like easy
> > solutions!)
>
> Shouldn't the headers be separated from the message body? It would
> solve the problem above, and probably improve performance. There should
> be a message_headers table that just holds the part of the message body
> that contains the headers.

Since there is only one set of headers per message, why not add a
headers field to physmessage?

> This would solve the problem of selecting too many rows, it would
> probably be faster since the message headers are usually much smaller
> than the size of a full message_blk, and it fixes the problem of the
> message headers exceeding the size of a message_blk which currently
> would cause problems.

Does anyone have any insight into why messages are split into
messageblks? The only MySQL limit I could find that might cause
problems is max_allowed_packet, but that can be configured. I did try
increasing READ_BLOCK_SIZE, and got a memory allocation error from
dbmail-smtp, but I assume that can be fixed.

xn
Re: db_get_main_header/follow-up on performance [ In reply to ]
From: "Christian G. Warden" <xndbmail@xerus.org>
> On Sun, Oct 12, 2003 at 05:18:37PM -0400, Matthew T. O'Connor wrote:
> > Shouldn't the headers be separated from the message body? It would
> > solve the problem above, and probably improve performance. There should
> > be a message_headers table that just holds the part of the message body
> > that contains the headers.
>
> Since there is only one set of headers per message, why not add a
> headers field to physmessage?

If I understand physmessage correctly, nothing in physmessage is touched
during a move (copy / delete) or a plain copy. If that's the case then I
think putting the headers in physmessage is fine.

> Does anyone have any insight into why messages are split into
> messageblks? The only MySQL limit I could find that might cause
> problems is max_allowed_packet, but that can be configured. I did try
> increasing READ_BLOCK_SIZE, and got a memory allocation error from
> dbmail-smtp, but I assume that can be fixed.

This is one of the design decisions that surprised me when I first started
using dbmail, I'm not sure why it's laid out this way either. The only
thing I can think of is that it allows large messages to be dealt with in
small chunks rather than in one big chunk.
Re: db_get_main_header/follow-up on performance [ In reply to ]
On Monday, Oct 13, 2003, at 02:32 Europe/Amsterdam, Matthew T. O'Connor
wrote:

> From: "Christian G. Warden" <xndbmail@xerus.org>.
>> Since there is only one set of headers per message, why not add a
>> headers field to physmessage?
>
> If I understand physmessage correctly, nothing in physmessage is
> touched
> during a move (copy / delete) or a plain copy. If that's the case
> then I
> think putting the headers in physmessage is fine.
You'r correct. The only time a physmessage entry is touched is on
INSERTion and the following UPDATE where stuff like messagesize is put
in. After that, the records are left unchanged until deleted (which only
happens if there are no more messages using a physmessage record
>
>> Does anyone have any insight into why messages are split into
>> messageblks? The only MySQL limit I could find that might cause
>> problems is max_allowed_packet, but that can be configured. I did try
>> increasing READ_BLOCK_SIZE, and got a memory allocation error from
>> dbmail-smtp, but I assume that can be fixed.
>
> This is one of the design decisions that surprised me when I first
> started
> using dbmail, I'm not sure why it's laid out this way either. The only
> thing I can think of is that it allows large messages to be dealt with
> in
> small chunks rather than in one big chunk.
This was done because of some limit in MySQL I believe. I'll have to ask
Roel and/or Eelco why exactly this was done.

Ilja
--
IC&S
Koningsweg 4
3582 GE UTRECHT
Re: db_get_main_header/follow-up on performance [ In reply to ]
On Sunday, Oct 12, 2003, at 23:04 Europe/Amsterdam, Christian G. Warden
wrote:

> On Sun, Oct 12, 2003 at 10:45:04PM +0200, Ilja Booij wrote:
>> On Friday, Oct 10, 2003, at 22:23 Europe/Amsterdam, Christian G.
>> Warden
>> wrote:
>>
>>> db_get_main_header() retrieves all of the messageblks. Shouldn't
>>> there
>>> be a LIMIT 1 in the query?
>>
>> LIMIT is not valid SQL-92 as far as I know, which would make it
>> difficult
>> or impossible to use the same code using another RDBMS (Oracle for
>> instance)
>
> But it is getting too many rows, right? Perhaps a select_limit
> function
> should be written for each sql driver. Or, even easier, why not add a
> header flag to the messageblks table?
I've rewritten the query to only return the lowest messageblk_idnr for
a message using select min(messageblk_idnr).. etc

the update is in CVS (2.0 branch)

Ilja

--
IC&S
Koningsweg 4
3582 GE UTRECHT