Mailing List Archive

some speed tests
Hi,

I did some tests with cvs head from today and found out that it is quite
slow when messages are inserted. I have a really aged machine for tests,
but even for that machine it is too slow:

Pentium 166, 128 MB RAM
Debian Sarge
Vanilla kernel 2.6.1
all services stopped that aren't required
IMAP protected using stunnel
Postgres 7.3.4 on a fast 7200 rpm 80 GB HD

Copying about 3000 mails using IMAP took more than an hour (!) in my
LAN.

I've enabled postgres stats collector and did some tests: I have set up
a new database with one user, and two mailboxes, INBOX and INBOX/Test
and renamed dbmail-smtp, so my MTA couldn't insert any messages.

I used the following SQL queries to get the stats:

select xact_commit from pg_stat_database where datname='dbmail'
To get the overall number of transactions.

select * from pg_stat_user_tables
To get detailed seq scans/INSERT/UPDATE/DELETE counts for every row.

select pg_stat_reset()
To reset all stats to zero.

I've reset the stats, started Evolution 1.4.5 and copied 2872 small
mails (sometimes spam can be useful) to my dbmail test machine.
Unfortunately Evolution read about 2% of the mails after I've copied
them, that's why some seq_scan numbers might be somewhat to high.
I've attached the table because of the line breaks too.

--------------------------------------------------
xact_commit
-------------
66906
(1 row)

relid | schemaname | relname | seq_scan | seq_tup_read |
idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
-------+------------+----------------------+----------+--------------+----------+---------------+-----------+-----------+-----------
83072 | public | aliases | 0 | 0 |
0 | 0 | 0 | 0 | 0
83082 | public | users | 0 | 0 |
11490 | 11490 | 0 | 5745 | 0
83095 | public | mailboxes | 20295 | 98603 |
0 | 0 | 0 | 0 | 0
83116 | public | subscription | 0 | 0 |
0 | 0 | 0 | 0 | 0
83128 | public | acl | 0 | 0 |
0 | 0 | 0 | 0 | 0
83151 | public | physmessage | 17490 | 25494744 |
0 | 0 | 2872 | 2958 | 0
83160 | public | messages | 23436 | 34300164 |
0 | 0 | 2872 | 2872 | 0
83189 | public | messageblks | 86 | 493984 |
0 | 0 | 5744 | 0 | 0
83206 | public | auto_notifications | 0 | 0 |
0 | 0 | 0 | 0 | 0
83219 | public | auto_replies | 0 | 0 |
0 | 0 | 0 | 0 | 0
[..]
(15 rows)

pg_stat_reset
---------------
t
(1 row)
--------------------------------------------------

There are several problems:
- 23 transactions for every message copied (!)
- the users table is updated twice for every message
- lot of sequential scans

We had that discussion some months ago yet - we really, really need
transactions. I didn't check the code, but is it possible to encapsulate
the database access for one mail into one transaction without major
rewrites? That would speed up a lot.


Then I checked the performance of dbmail-maintenance (cfpd).

There is one seqScan on mailboxes for every message even if there is
nothing to do - several thousand mailboxes and some million mails and
that will take a really long time.

I've marked all messages as deleted and ran maintenance the first time
(set delete status for deleted messages) - 2872 updates on messages but
only 31 transactions, that looks very good.
The second run of maintenance (delete messages with delete status)
looked worse: 2872 delete on physmessage and messages, 5744 delete on
messageblks but 14389 transactions.


It should be easy to add an index for mailboxes to remove the sequential
scans, but I don't know what query forces postgres to do the seqScan.

I checked the code and found this query for example:
snprintf(query, DEF_QUERYSIZE,
"SELECT mailbox_idnr FROM mailboxes "
"WHERE LOWER(name) = LOWER('%s') "
"AND owner_idnr='%llu'", name, owner_idnr);
We have this index:
CREATE INDEX mailboxes_name_idx ON mailboxes(name);
But we would need:
CREATE INDEX mailboxes_lower_name_idx ON mailboxes(lower(name));

Is IMAP case sensitive? Are folders like INBOX/test and INBOX/Test
possible? If it's insensitive we could lower() all queries on names and
create an index on lower(name).


--
MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
Re: some speed tests [ In reply to ]
Do you have any way of narrowing this down to specific queries that are taking
the longest and/or are being executed the most? That would identify which
low-level database functions are being called, then we can just trace our way
up the call chain to see who's misbehaving or acting on a flawed design. Also,
if you could run similar tests against the latest 1.2, it would help to give a
frame of reference, particularly for my delivery chain design.

Aaron


Thomas Mueller <dbmail@tmueller.com> said:

> Hi,
>
> I did some tests with cvs head from today and found out that it is quite
> slow when messages are inserted. I have a really aged machine for tests,
> but even for that machine it is too slow:
>
> Pentium 166, 128 MB RAM
> Debian Sarge
> Vanilla kernel 2.6.1
> all services stopped that aren't required
> IMAP protected using stunnel
> Postgres 7.3.4 on a fast 7200 rpm 80 GB HD
>
> Copying about 3000 mails using IMAP took more than an hour (!) in my
> LAN.
>
> I've enabled postgres stats collector and did some tests: I have set up
> a new database with one user, and two mailboxes, INBOX and INBOX/Test
> and renamed dbmail-smtp, so my MTA couldn't insert any messages.
>
> I used the following SQL queries to get the stats:
>
> select xact_commit from pg_stat_database where datname='dbmail'
> To get the overall number of transactions.
>
> select * from pg_stat_user_tables
> To get detailed seq scans/INSERT/UPDATE/DELETE counts for every row.
>
> select pg_stat_reset()
> To reset all stats to zero.
>
> I've reset the stats, started Evolution 1.4.5 and copied 2872 small
> mails (sometimes spam can be useful) to my dbmail test machine.
> Unfortunately Evolution read about 2% of the mails after I've copied
> them, that's why some seq_scan numbers might be somewhat to high.
> I've attached the table because of the line breaks too.
>
> --------------------------------------------------
> xact_commit
> -------------
> 66906
> (1 row)
>
> relid | schemaname | relname | seq_scan | seq_tup_read |
> idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del
>
-------+------------+----------------------+----------+--------------+----------+---------------+-----------+-----------+-----------
> 83072 | public | aliases | 0 | 0 |
> 0 | 0 | 0 | 0 | 0
> 83082 | public | users | 0 | 0 |
> 11490 | 11490 | 0 | 5745 | 0
> 83095 | public | mailboxes | 20295 | 98603 |
> 0 | 0 | 0 | 0 | 0
> 83116 | public | subscription | 0 | 0 |
> 0 | 0 | 0 | 0 | 0
> 83128 | public | acl | 0 | 0 |
> 0 | 0 | 0 | 0 | 0
> 83151 | public | physmessage | 17490 | 25494744 |
> 0 | 0 | 2872 | 2958 | 0
> 83160 | public | messages | 23436 | 34300164 |
> 0 | 0 | 2872 | 2872 | 0
> 83189 | public | messageblks | 86 | 493984 |
> 0 | 0 | 5744 | 0 | 0
> 83206 | public | auto_notifications | 0 | 0 |
> 0 | 0 | 0 | 0 | 0
> 83219 | public | auto_replies | 0 | 0 |
> 0 | 0 | 0 | 0 | 0
> [..]
> (15 rows)
>
> pg_stat_reset
> ---------------
> t
> (1 row)
> --------------------------------------------------
>
> There are several problems:
> - 23 transactions for every message copied (!)
> - the users table is updated twice for every message
> - lot of sequential scans
>
> We had that discussion some months ago yet - we really, really need
> transactions. I didn't check the code, but is it possible to encapsulate
> the database access for one mail into one transaction without major
> rewrites? That would speed up a lot.
>
>
> Then I checked the performance of dbmail-maintenance (cfpd).
>
> There is one seqScan on mailboxes for every message even if there is
> nothing to do - several thousand mailboxes and some million mails and
> that will take a really long time.
>
> I've marked all messages as deleted and ran maintenance the first time
> (set delete status for deleted messages) - 2872 updates on messages but
> only 31 transactions, that looks very good.
> The second run of maintenance (delete messages with delete status)
> looked worse: 2872 delete on physmessage and messages, 5744 delete on
> messageblks but 14389 transactions.
>
>
> It should be easy to add an index for mailboxes to remove the sequential
> scans, but I don't know what query forces postgres to do the seqScan.
>
> I checked the code and found this query for example:
> snprintf(query, DEF_QUERYSIZE,
> "SELECT mailbox_idnr FROM mailboxes "
> "WHERE LOWER(name) = LOWER('%s') "
> "AND owner_idnr='%llu'", name, owner_idnr);
> We have this index:
> CREATE INDEX mailboxes_name_idx ON mailboxes(name);
> But we would need:
> CREATE INDEX mailboxes_lower_name_idx ON mailboxes(lower(name));
>
> Is IMAP case sensitive? Are folders like INBOX/test and INBOX/Test
> possible? If it's insensitive we could lower() all queries on names and
> create an index on lower(name).
>
>
> --
> MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
>
>



--
Re: some speed tests [ In reply to ]
Hi Aaron,

> Do you have any way of narrowing this down to specific queries that are taking
> the longest and/or are being executed the most? That would identify which
> low-level database functions are being called, then we can just trace our way
> up the call chain to see who's misbehaving or acting on a flawed design. Also,
> if you could run similar tests against the latest 1.2, it would help to give a
> frame of reference, particularly for my delivery chain design.

That's simple: the main design flaw (actually that's no design flaw I
think, that's because the so called database MySQL couldn't do
transactions in the past) is that dbmail doesn't use transaction.

Because of that AutoCommit is used and whenever dbmail does anySqlQuery
Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that is terribly
slow. To ensure the Durability in ACID the database has to fflush()
every transaction to stable storage!
That's why there is only one solution: we have to use transaction.

With transactions we could remove the integrity checks of
dbmail-maintenance too, because the database guarantees integrity.

Anyway, I did a trace of all SQL queries when a mail is copied using
IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations to insert
one mail?).

When searching for the sequential scan I found something quite
interesting in the docs: the planer decides for every scan if a seqScan
is cheaper that an index scan, and does a seqScan even if an index
exists:

dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE owner_idnr=2;
QUERY PLAN
---------------------------------------------------------
Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
Filter: (owner_idnr = 2)
(2 rows)

The table has an index on owner_idnr.

So I should repeat this test with a database with several hundred to
thousand user, several dozen mailboxes for each user and several dozen
mails in each mailbox to find out if all required indizes are there.
Did anyone write a script to create such a database?

But I found a strange query:
SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
owner_idnr = '2'
mailbox_idnr is the primary key so that could be optimized to:
SELECT 4
;-)


I don't have a 1.2 installation, I'm sorry.


--
MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
RE: some speed tests [ In reply to ]
Try: explain SELECT mailbox_idnr FROM mailboxes WHERE
owner_idnr=2::bigint;

-----Original Message-----
From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
On Behalf Of Thomas Mueller
Sent: Sunday, February 15, 2004 8:11 AM
To: dbmail-dev@dbmail.org
Subject: Re: [Dbmail-dev] some speed tests


Hi Aaron,

> Do you have any way of narrowing this down to specific queries that
> are taking the longest and/or are being executed the most? That would
> identify which low-level database functions are being called, then we
> can just trace our way up the call chain to see who's misbehaving or
> acting on a flawed design. Also, if you could run similar tests
> against the latest 1.2, it would help to give a frame of reference,
> particularly for my delivery chain design.

That's simple: the main design flaw (actually that's no design flaw I
think, that's because the so called database MySQL couldn't do
transactions in the past) is that dbmail doesn't use transaction.

Because of that AutoCommit is used and whenever dbmail does anySqlQuery
Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that is terribly slow.
To ensure the Durability in ACID the database has to fflush() every
transaction to stable storage! That's why there is only one solution: we
have to use transaction.

With transactions we could remove the integrity checks of
dbmail-maintenance too, because the database guarantees integrity.

Anyway, I did a trace of all SQL queries when a mail is copied using
IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations to insert
one mail?).

When searching for the sequential scan I found something quite
interesting in the docs: the planer decides for every scan if a seqScan
is cheaper that an index scan, and does a seqScan even if an index
exists:

dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE owner_idnr=2;
QUERY PLAN
---------------------------------------------------------
Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
Filter: (owner_idnr = 2)
(2 rows)

The table has an index on owner_idnr.

So I should repeat this test with a database with several hundred to
thousand user, several dozen mailboxes for each user and several dozen
mails in each mailbox to find out if all required indizes are there. Did
anyone write a script to create such a database?

But I found a strange query:
SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
owner_idnr = '2' mailbox_idnr is the primary key so that could be
optimized to: SELECT 4
;-)


I don't have a 1.2 installation, I'm sorry.


--
MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev
Re: some speed tests [ In reply to ]
Hi John,

> Try: explain SELECT mailbox_idnr FROM mailboxes WHERE
> owner_idnr=2::bigint;

No difference:

dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
owner_idnr=2::bigint;
QUERY PLAN
---------------------------------------------------------
Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
Filter: (owner_idnr = 2::bigint)

> dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE owner_idnr=2;
> QUERY PLAN
> ---------------------------------------------------------
> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> Filter: (owner_idnr = 2)
> (2 rows)
>
> The table has an index on owner_idnr.


--
MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
RE: some speed tests [ In reply to ]
Try altering the stats target for the table, and run vacuum analyze.

Then try your query again.

-----Original Message-----
From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
On Behalf Of Thomas Mueller
Sent: Sunday, February 15, 2004 9:26 AM
To: dbmail-dev@dbmail.org
Subject: Re: [Dbmail-dev] some speed tests


Hi John,

> Try: explain SELECT mailbox_idnr FROM mailboxes WHERE
> owner_idnr=2::bigint;

No difference:

dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
owner_idnr=2::bigint;
QUERY PLAN
---------------------------------------------------------
Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
Filter: (owner_idnr = 2::bigint)

> dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
owner_idnr=2;
> QUERY PLAN
> ---------------------------------------------------------
> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> Filter: (owner_idnr = 2)
> (2 rows)
>
> The table has an index on owner_idnr.


--
MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev
Re: some speed tests [ In reply to ]
I don't even know where to begin in terms of designing the delivery chain
around transactions. Could we do it as simply as adding functions...

void db_begin_transaction(void);
void db_flush_transaction(void); (or db_commit_...?)

and then calling these functions before and after each major section of
database code? For the delivery chain, we could do it inside of
insert_message(). For dbmail-smtp, this basically means that the execution of
the whole program is within one transaction. For dbmail-lmtpd, it means that
each message is delivered within a transaction but the miscellaneous queries
before the main message delivery chain are not transacted. For MySQL, these
functions would be noops.

Thing that might work?

Aaron


Thomas Mueller <dbmail@tmueller.com> said:

> Hi Aaron,
>
> > Do you have any way of narrowing this down to specific queries that are
> > taking the longest and/or are being executed the most? That would
> > identify which low-level database functions are being called, then we
> > can just trace our way up the call chain to see who's misbehaving or
> > acting on a flawed design. Also, if you could run similar tests against
> > the latest 1.2, it would help to give a frame of reference, particularly
> > for my delivery chain design.
>
> That's simple: the main design flaw (actually that's no design flaw I
> think, that's because the so called database MySQL couldn't do
> transactions in the past) is that dbmail doesn't use transaction.
>
> Because of that AutoCommit is used and whenever dbmail does anySqlQuery
> Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that is terribly
> slow. To ensure the Durability in ACID the database has to fflush()
> every transaction to stable storage!
> That's why there is only one solution: we have to use transaction.
>
> With transactions we could remove the integrity checks of
> dbmail-maintenance too, because the database guarantees integrity.
>
> Anyway, I did a trace of all SQL queries when a mail is copied using
> IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations to insert
> one mail?).
>
> When searching for the sequential scan I found something quite
> interesting in the docs: the planer decides for every scan if a seqScan
> is cheaper that an index scan, and does a seqScan even if an index
> exists:
>
> dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE owner_idnr=2;
> QUERY PLAN
> ---------------------------------------------------------
> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> Filter: (owner_idnr = 2)
> (2 rows)
>
> The table has an index on owner_idnr.
>
> So I should repeat this test with a database with several hundred to
> thousand user, several dozen mailboxes for each user and several dozen
> mails in each mailbox to find out if all required indizes are there.
> Did anyone write a script to create such a database?
>
> But I found a strange query:
> SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> owner_idnr = '2'
> mailbox_idnr is the primary key so that could be optimized to:
> SELECT 4
> ;-)
>
>
> I don't have a 1.2 installation, I'm sorry.
>
>
> --
> MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>



--
Re: some speed tests [ In reply to ]
Forgive the bluntness of the statement, but why is anyone even worrying
about transactions as they relate to MySQL???

COMMIT in MySQL is passed to the table handler. In the case of MyISAM
tables, the handler disregards the statement. For InnoDB and BDB tables,
COMMIT acts as it does in PostgreSQL.

So various people in this thread implying that MySQL isn't really a
database need to do some more reading.

In summary, just encapsulate everything in transaction blocks and the
underlaying database will act appropriately.

Regards,

Chris

Aaron Stone wrote:

>I don't even know where to begin in terms of designing the delivery chain
>around transactions. Could we do it as simply as adding functions...
>
> void db_begin_transaction(void);
> void db_flush_transaction(void); (or db_commit_...?)
>
>and then calling these functions before and after each major section of
>database code? For the delivery chain, we could do it inside of
>insert_message(). For dbmail-smtp, this basically means that the execution of
>the whole program is within one transaction. For dbmail-lmtpd, it means that
>each message is delivered within a transaction but the miscellaneous queries
>before the main message delivery chain are not transacted. For MySQL, these
>functions would be noops.
>
>Thing that might work?
>
>Aaron
>
>
>Thomas Mueller <dbmail@tmueller.com> said:
>
>
>
>>Hi Aaron,
>>
>>
>>
>>>Do you have any way of narrowing this down to specific queries that are
>>>taking the longest and/or are being executed the most? That would
>>>identify which low-level database functions are being called, then we
>>>can just trace our way up the call chain to see who's misbehaving or
>>>acting on a flawed design. Also, if you could run similar tests against
>>>the latest 1.2, it would help to give a frame of reference, particularly
>>>for my delivery chain design.
>>>
>>>
>>That's simple: the main design flaw (actually that's no design flaw I
>>think, that's because the so called database MySQL couldn't do
>>transactions in the past) is that dbmail doesn't use transaction.
>>
>>Because of that AutoCommit is used and whenever dbmail does anySqlQuery
>>Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that is terribly
>>slow. To ensure the Durability in ACID the database has to fflush()
>>every transaction to stable storage!
>>That's why there is only one solution: we have to use transaction.
>>
>>With transactions we could remove the integrity checks of
>>dbmail-maintenance too, because the database guarantees integrity.
>>
>>Anyway, I did a trace of all SQL queries when a mail is copied using
>>IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations to insert
>>one mail?).
>>
>>When searching for the sequential scan I found something quite
>>interesting in the docs: the planer decides for every scan if a seqScan
>>is cheaper that an index scan, and does a seqScan even if an index
>>exists:
>>
>>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE owner_idnr=2;
>> QUERY PLAN
>>---------------------------------------------------------
>> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
>> Filter: (owner_idnr = 2)
>>(2 rows)
>>
>>The table has an index on owner_idnr.
>>
>>So I should repeat this test with a database with several hundred to
>>thousand user, several dozen mailboxes for each user and several dozen
>>mails in each mailbox to find out if all required indizes are there.
>>Did anyone write a script to create such a database?
>>
>>But I found a strange query:
>>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
>>owner_idnr = '2'
>>mailbox_idnr is the primary key so that could be optimized to:
>>SELECT 4
>>;-)
>>
>>
>>I don't have a 1.2 installation, I'm sorry.
>>
>>
>>--
>>MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
>>_______________________________________________
>>Dbmail-dev mailing list
>>Dbmail-dev@dbmail.org
>>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>>
>>
>>
>
>
>
>
>
Re: some speed tests [ In reply to ]
Comments inline...

Chris Nolan <chris@imi.com.au> said:

> Forgive the bluntness of the statement, but why is anyone even worrying
> about transactions as they relate to MySQL???

Because we are currently structured to have a single set of mid-level database
operations that are translated into specific low-level database function
calls. The upside is that there isn't any database specific handling outside
of these mid-level functions. The downside is that we need to make sure that
function calls for certain features are consistent with analogous features in
each database and that they carry with them enough information to make the
appropriate low-level call. A good example is the last inserted id number. In
MySQL, you only need the database connection identifier to get this. In
PostgreSQL, yo need both the database connection and the table identifier. For
DBMail to have a mid-level function that worked for both, we'd have to make
sure that it took both arguments and used them as needed for whichever
database's low-level calls were being used. (Those were off the top of my
head, so they may be incorrect, but they do illustrate my point.)

> COMMIT in MySQL is passed to the table handler. In the case of MyISAM
> tables, the handler disregards the statement. For InnoDB and BDB tables,
> COMMIT acts as it does in PostgreSQL.

So then we have THREE different configurations to consider, and need to be
sure to design the mid-level interface appropriately.

> So various people in this thread implying that MySQL isn't really a
> database need to do some more reading.

If you're referring to my suggestion that the transaction functions are a noop
for MySQL, then you're reading too deeply... I just didn't realize that InnoDB
would handle transactions entirely normally.

> In summary, just encapsulate everything in transaction blocks and the
> underlaying database will act appropriately.

Right. That's what we're talking about.

Is the entire point of your comment that we can safely use

BEGIN;

.. whatever dbmail does...

COMMIT;

regardless of whether or not we know if the host database actually supports
these keywords? You could have just said that.


> Regards,
>
> Chris
>
> Aaron Stone wrote:
>
> >I don't even know where to begin in terms of designing the delivery chain
> >around transactions. Could we do it as simply as adding functions...
> >
> > void db_begin_transaction(void);
> > void db_flush_transaction(void); (or db_commit_...?)
> >
> >and then calling these functions before and after each major section of
> >database code? For the delivery chain, we could do it inside of
> >insert_message(). For dbmail-smtp, this basically means that the execution of
> >the whole program is within one transaction. For dbmail-lmtpd, it means that
> >each message is delivered within a transaction but the miscellaneous queries
> >before the main message delivery chain are not transacted. For MySQL, these
> >functions would be noops.
> >
> >Thing that might work?
> >
> >Aaron
> >
> >
> >Thomas Mueller <dbmail@tmueller.com> said:
> >
> >
> >
> >>Hi Aaron,
> >>
> >>
> >>
> >>>Do you have any way of narrowing this down to specific queries that are
> >>>taking the longest and/or are being executed the most? That would
> >>>identify which low-level database functions are being called, then we
> >>>can just trace our way up the call chain to see who's misbehaving or
> >>>acting on a flawed design. Also, if you could run similar tests against
> >>>the latest 1.2, it would help to give a frame of reference, particularly
> >>>for my delivery chain design.
> >>>
> >>>
> >>That's simple: the main design flaw (actually that's no design flaw I
> >>think, that's because the so called database MySQL couldn't do
> >>transactions in the past) is that dbmail doesn't use transaction.
> >>
> >>Because of that AutoCommit is used and whenever dbmail does anySqlQuery
> >>Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that is terribly
> >>slow. To ensure the Durability in ACID the database has to fflush()
> >>every transaction to stable storage!
> >>That's why there is only one solution: we have to use transaction.
> >>
> >>With transactions we could remove the integrity checks of
> >>dbmail-maintenance too, because the database guarantees integrity.
> >>
> >>Anyway, I did a trace of all SQL queries when a mail is copied using
> >>IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations to insert
> >>one mail?).
> >>
> >>When searching for the sequential scan I found something quite
> >>interesting in the docs: the planer decides for every scan if a seqScan
> >>is cheaper that an index scan, and does a seqScan even if an index
> >>exists:
> >>
> >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE owner_idnr=2;
> >> QUERY PLAN
> >>---------------------------------------------------------
> >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> >> Filter: (owner_idnr = 2)
> >>(2 rows)
> >>
> >>The table has an index on owner_idnr.
> >>
> >>So I should repeat this test with a database with several hundred to
> >>thousand user, several dozen mailboxes for each user and several dozen
> >>mails in each mailbox to find out if all required indizes are there.
> >>Did anyone write a script to create such a database?
> >>
> >>But I found a strange query:
> >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> >>owner_idnr = '2'
> >>mailbox_idnr is the primary key so that could be optimized to:
> >>SELECT 4
> >>;-)
> >>
> >>
> >>I don't have a 1.2 installation, I'm sorry.
> >>
> >>
> >>--
> >>MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
> >>_______________________________________________
> >>Dbmail-dev mailing list
> >>Dbmail-dev@dbmail.org
> >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> >>
> >>
> >>
> >
> >
> >
> >
> >
>
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>



--
Re: some speed tests [ In reply to ]
Dear Aaron,

My comments are also inline. :-)

On Sun, 2004-02-15 at 12:27, Aaron Stone wrote:
> Comments inline...
>
> Chris Nolan <chris@imi.com.au> said:
>
> > Forgive the bluntness of the statement, but why is anyone even worrying
> > about transactions as they relate to MySQL???
>
> Because we are currently structured to have a single set of mid-level database
> operations that are translated into specific low-level database function
> calls. The upside is that there isn't any database specific handling outside
> of these mid-level functions. The downside is that we need to make sure that
> function calls for certain features are consistent with analogous features in
> each database and that they carry with them enough information to make the
> appropriate low-level call. A good example is the last inserted id number. In
> MySQL, you only need the database connection identifier to get this. In
> PostgreSQL, yo need both the database connection and the table identifier. For
> DBMail to have a mid-level function that worked for both, we'd have to make
> sure that it took both arguments and used them as needed for whichever
> database's low-level calls were being used. (Those were off the top of my
> head, so they may be incorrect, but they do illustrate my point.)

I've looked through the DBMail source code on a few occassions and even
released a dodgy tool to convert Cyrus mailboxes to DBMail a while ago.
The fact that I can use the calls present in the DB modules and avoid
having to write queries by hand for each different DB is an excellent
feature of the DBMail API. I have nothing negative to say here at all.

>
> > COMMIT in MySQL is passed to the table handler. In the case of MyISAM
> > tables, the handler disregards the statement. For InnoDB and BDB tables,
> > COMMIT acts as it does in PostgreSQL.
>
> So then we have THREE different configurations to consider, and need to be
> sure to design the mid-level interface appropriately.

My statement was meant to say that the MyISAM table handler will just
disregard BEGIN and COMMIT statements, just as it parses but ignores
CHECK and FOREIGN KEY constraints in table creation, thus illustrating
that you wouldn't need to add a mysql-with-transactions directory to the
source tree.

>
> > So various people in this thread implying that MySQL isn't really a
> > database need to do some more reading.
>
> If you're referring to my suggestion that the transaction functions are a noop
> for MySQL, then you're reading too deeply... I just didn't realize that InnoDB
> would handle transactions entirely normally.

My apologies, I was not referring to you at all! Yourself and your
collegues have provided the world with a very, very funky mail
repository!

I was referring to someone who said in the body of their message posted
to another branch of this thread "the supposed database mysql".
Considering MySQL has proven itself time and time again in terms of
reliability, licencing flexibility and performance, such statements are
baseless regardless of what arguments you make for which features. For
example, Visual FoxPro's backend is technically a database but provides
nothing in the way of a privilege system (you need write access to
record locking, so filesystem-level controls are no good either).

>
> > In summary, just encapsulate everything in transaction blocks and the
> > underlaying database will act appropriately.
>
> Right. That's what we're talking about.
>
> Is the entire point of your comment that we can safely use
>
> BEGIN;
>
> .. whatever dbmail does...
>
> COMMIT;
>
> regardless of whether or not we know if the host database actually supports
> these keywords? You could have just said that.

I could have, but then I would have risked putting an end to the thread.
:-)

Please accept my apology - I never meant to offend yourself or anyone
else who contributes to DBMail. It seems my reading of the post I
mentioned above caused me to be a bit heavier than I should have been.

But yes, you have perfectly summarised the point of my comment.
>
> > Regards,
> >
> > Chris
> >
> > Aaron Stone wrote:
> >
> > >I don't even know where to begin in terms of designing the delivery chain
> > >around transactions. Could we do it as simply as adding functions...
> > >
> > > void db_begin_transaction(void);
> > > void db_flush_transaction(void); (or db_commit_...?)
> > >
> > >and then calling these functions before and after each major section of
> > >database code? For the delivery chain, we could do it inside of
> > >insert_message(). For dbmail-smtp, this basically means that the execution of
> > >the whole program is within one transaction. For dbmail-lmtpd, it means that
> > >each message is delivered within a transaction but the miscellaneous queries
> > >before the main message delivery chain are not transacted. For MySQL, these
> > >functions would be noops.
> > >
> > >Thing that might work?
> > >
> > >Aaron
> > >
> > >
> > >Thomas Mueller <dbmail@tmueller.com> said:
> > >
> > >
> > >
> > >>Hi Aaron,
> > >>
> > >>
> > >>
> > >>>Do you have any way of narrowing this down to specific queries that are
> > >>>taking the longest and/or are being executed the most? That would
> > >>>identify which low-level database functions are being called, then we
> > >>>can just trace our way up the call chain to see who's misbehaving or
> > >>>acting on a flawed design. Also, if you could run similar tests against
> > >>>the latest 1.2, it would help to give a frame of reference, particularly
> > >>>for my delivery chain design.
> > >>>
> > >>>
> > >>That's simple: the main design flaw (actually that's no design flaw I
> > >>think, that's because the so called database MySQL couldn't do
> > >>transactions in the past) is that dbmail doesn't use transaction.
> > >>
> > >>Because of that AutoCommit is used and whenever dbmail does anySqlQuery
> > >>Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that is terribly
> > >>slow. To ensure the Durability in ACID the database has to fflush()
> > >>every transaction to stable storage!
> > >>That's why there is only one solution: we have to use transaction.
> > >>
> > >>With transactions we could remove the integrity checks of
> > >>dbmail-maintenance too, because the database guarantees integrity.
> > >>
> > >>Anyway, I did a trace of all SQL queries when a mail is copied using
> > >>IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations to insert
> > >>one mail?).
> > >>
> > >>When searching for the sequential scan I found something quite
> > >>interesting in the docs: the planer decides for every scan if a seqScan
> > >>is cheaper that an index scan, and does a seqScan even if an index
> > >>exists:
> > >>
> > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE owner_idnr=2;
> > >> QUERY PLAN
> > >>---------------------------------------------------------
> > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > >> Filter: (owner_idnr = 2)
> > >>(2 rows)
> > >>
> > >>The table has an index on owner_idnr.
> > >>
> > >>So I should repeat this test with a database with several hundred to
> > >>thousand user, several dozen mailboxes for each user and several dozen
> > >>mails in each mailbox to find out if all required indizes are there.
> > >>Did anyone write a script to create such a database?
> > >>
> > >>But I found a strange query:
> > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> > >>owner_idnr = '2'
> > >>mailbox_idnr is the primary key so that could be optimized to:
> > >>SELECT 4
> > >>;-)
> > >>
> > >>
> > >>I don't have a 1.2 installation, I'm sorry.
> > >>
> > >>
> > >>--
> > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
> > >>_______________________________________________
> > >>Dbmail-dev mailing list
> > >>Dbmail-dev@dbmail.org
> > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > >>
> > >>
> > >>
> > >
> > >
> > >
> > >
> > >
> >
> >
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> >
>
>
RE: some speed tests [ In reply to ]
I assume here that you talk about MySQL 4.
MySQL 3 does not have transactions.

-----Original Message-----
From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
On Behalf Of Chris Nolan
Sent: Sunday, February 15, 2004 11:57 AM
To: dbmail-dev@dbmail.org
Subject: Re: [Dbmail-dev] some speed tests


Forgive the bluntness of the statement, but why is anyone even worrying
about transactions as they relate to MySQL???

COMMIT in MySQL is passed to the table handler. In the case of MyISAM
tables, the handler disregards the statement. For InnoDB and BDB tables,

COMMIT acts as it does in PostgreSQL.

So various people in this thread implying that MySQL isn't really a
database need to do some more reading.

In summary, just encapsulate everything in transaction blocks and the
underlaying database will act appropriately.

Regards,

Chris

Aaron Stone wrote:

>I don't even know where to begin in terms of designing the delivery
>chain around transactions. Could we do it as simply as adding
>functions...
>
> void db_begin_transaction(void);
> void db_flush_transaction(void); (or db_commit_...?)
>
>and then calling these functions before and after each major section of

>database code? For the delivery chain, we could do it inside of
>insert_message(). For dbmail-smtp, this basically means that the
>execution of the whole program is within one transaction. For
>dbmail-lmtpd, it means that each message is delivered within a
>transaction but the miscellaneous queries before the main message
>delivery chain are not transacted. For MySQL, these functions would be
>noops.
>
>Thing that might work?
>
>Aaron
>
>
>Thomas Mueller <dbmail@tmueller.com> said:
>
>
>
>>Hi Aaron,
>>
>>
>>
>>>Do you have any way of narrowing this down to specific queries that
>>>are taking the longest and/or are being executed the most? That would

>>>identify which low-level database functions are being called, then we

>>>can just trace our way up the call chain to see who's misbehaving or
>>>acting on a flawed design. Also, if you could run similar tests
>>>against the latest 1.2, it would help to give a frame of reference,
>>>particularly for my delivery chain design.
>>>
>>>
>>That's simple: the main design flaw (actually that's no design flaw I
>>think, that's because the so called database MySQL couldn't do
>>transactions in the past) is that dbmail doesn't use transaction.
>>
>>Because of that AutoCommit is used and whenever dbmail does
>>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that is
>>terribly slow. To ensure the Durability in ACID the database has to
>>fflush() every transaction to stable storage! That's why there is only

>>one solution: we have to use transaction.
>>
>>With transactions we could remove the integrity checks of
>>dbmail-maintenance too, because the database guarantees integrity.
>>
>>Anyway, I did a trace of all SQL queries when a mail is copied using
>>IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations to insert
>>one mail?).
>>
>>When searching for the sequential scan I found something quite
>>interesting in the docs: the planer decides for every scan if a
>>seqScan is cheaper that an index scan, and does a seqScan even if an
>>index
>>exists:
>>
>>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
owner_idnr=2;
>> QUERY PLAN
>>---------------------------------------------------------
>> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
>> Filter: (owner_idnr = 2)
>>(2 rows)
>>
>>The table has an index on owner_idnr.
>>
>>So I should repeat this test with a database with several hundred to
>>thousand user, several dozen mailboxes for each user and several dozen

>>mails in each mailbox to find out if all required indizes are there.
>>Did anyone write a script to create such a database?
>>
>>But I found a strange query:
>>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
>>owner_idnr = '2' mailbox_idnr is the primary key so that could be
>>optimized to: SELECT 4
>>;-)
>>
>>
>>I don't have a 1.2 installation, I'm sorry.
>>
>>
>>--
>>MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
>>_______________________________________________
>>Dbmail-dev mailing list
>>Dbmail-dev@dbmail.org
>>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>>
>>
>>
>
>
>
>
>


_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev
RE: some speed tests [ In reply to ]
Personally, I recent the statement; 'Considering MySQL has proven itself
time and time again in terms of reliability, licencing flexibility and
performance, such statements are baseless regardless of what arguments
you make for which features'; based on experience, mysql proved that it
was not reliable, tho performance was top of the line.

In my opinion, if you're concerned with reliability, mysql should not be
an option.
I had daily table corruptions, and thus one of my main tasks was to run
mysql repair jobs regularly, just to keep everything running.

Granted, postgresql has it's own problems, mainly with performance and
lack of a proper master / slave replication model, but that's a small
price to pay for reliability.

Just my $0.02 worth.

Regards,

John Hansen

-----Original Message-----
From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
On Behalf Of Chris Nolan
Sent: Sunday, February 15, 2004 4:42 PM
To: dbmail-dev@dbmail.org
Subject: Re: [Dbmail-dev] some speed tests


Dear Aaron,

My comments are also inline. :-)

On Sun, 2004-02-15 at 12:27, Aaron Stone wrote:
> Comments inline...
>
> Chris Nolan <chris@imi.com.au> said:
>
> > Forgive the bluntness of the statement, but why is anyone even
> > worrying
> > about transactions as they relate to MySQL???
>
> Because we are currently structured to have a single set of mid-level
> database operations that are translated into specific low-level
> database function calls. The upside is that there isn't any database
> specific handling outside of these mid-level functions. The downside
> is that we need to make sure that function calls for certain features
> are consistent with analogous features in each database and that they
> carry with them enough information to make the appropriate low-level
> call. A good example is the last inserted id number. In MySQL, you
> only need the database connection identifier to get this. In
> PostgreSQL, yo need both the database connection and the table
> identifier. For DBMail to have a mid-level function that worked for
> both, we'd have to make sure that it took both arguments and used them

> as needed for whichever database's low-level calls were being used.
> (Those were off the top of my head, so they may be incorrect, but they

> do illustrate my point.)

I've looked through the DBMail source code on a few occassions and even
released a dodgy tool to convert Cyrus mailboxes to DBMail a while ago.
The fact that I can use the calls present in the DB modules and avoid
having to write queries by hand for each different DB is an excellent
feature of the DBMail API. I have nothing negative to say here at all.

>
> > COMMIT in MySQL is passed to the table handler. In the case of
> > MyISAM
> > tables, the handler disregards the statement. For InnoDB and BDB
tables,
> > COMMIT acts as it does in PostgreSQL.
>
> So then we have THREE different configurations to consider, and need
> to be sure to design the mid-level interface appropriately.

My statement was meant to say that the MyISAM table handler will just
disregard BEGIN and COMMIT statements, just as it parses but ignores
CHECK and FOREIGN KEY constraints in table creation, thus illustrating
that you wouldn't need to add a mysql-with-transactions directory to the
source tree.

>
> > So various people in this thread implying that MySQL isn't really a
> > database need to do some more reading.
>
> If you're referring to my suggestion that the transaction functions
> are a noop for MySQL, then you're reading too deeply... I just didn't
> realize that InnoDB would handle transactions entirely normally.

My apologies, I was not referring to you at all! Yourself and your
collegues have provided the world with a very, very funky mail
repository!

I was referring to someone who said in the body of their message posted
to another branch of this thread "the supposed database mysql".
Considering MySQL has proven itself time and time again in terms of
reliability, licencing flexibility and performance, such statements are
baseless regardless of what arguments you make for which features. For
example, Visual FoxPro's backend is technically a database but provides
nothing in the way of a privilege system (you need write access to
record locking, so filesystem-level controls are no good either).

>
> > In summary, just encapsulate everything in transaction blocks and
> > the
> > underlaying database will act appropriately.
>
> Right. That's what we're talking about.
>
> Is the entire point of your comment that we can safely use
>
> BEGIN;
>
> .. whatever dbmail does...
>
> COMMIT;
>
> regardless of whether or not we know if the host database actually
> supports these keywords? You could have just said that.

I could have, but then I would have risked putting an end to the thread.
:-)

Please accept my apology - I never meant to offend yourself or anyone
else who contributes to DBMail. It seems my reading of the post I
mentioned above caused me to be a bit heavier than I should have been.

But yes, you have perfectly summarised the point of my comment.
>
> > Regards,
> >
> > Chris
> >
> > Aaron Stone wrote:
> >
> > >I don't even know where to begin in terms of designing the delivery

> > >chain around transactions. Could we do it as simply as adding
> > >functions...
> > >
> > > void db_begin_transaction(void);
> > > void db_flush_transaction(void); (or db_commit_...?)
> > >
> > >and then calling these functions before and after each major
> > >section of database code? For the delivery chain, we could do it
> > >inside of insert_message(). For dbmail-smtp, this basically means
> > >that the execution of the whole program is within one transaction.
> > >For dbmail-lmtpd, it means that each message is delivered within a
> > >transaction but the miscellaneous queries before the main message
> > >delivery chain are not transacted. For MySQL, these functions would

> > >be noops.
> > >
> > >Thing that might work?
> > >
> > >Aaron
> > >
> > >
> > >Thomas Mueller <dbmail@tmueller.com> said:
> > >
> > >
> > >
> > >>Hi Aaron,
> > >>
> > >>
> > >>
> > >>>Do you have any way of narrowing this down to specific queries
> > >>>that are taking the longest and/or are being executed the most?
> > >>>That would identify which low-level database functions are being
> > >>>called, then we can just trace our way up the call chain to see
> > >>>who's misbehaving or acting on a flawed design. Also, if you
> > >>>could run similar tests against the latest 1.2, it would help to
> > >>>give a frame of reference, particularly for my delivery chain
> > >>>design.
> > >>>
> > >>>
> > >>That's simple: the main design flaw (actually that's no design
> > >>flaw I think, that's because the so called database MySQL couldn't

> > >>do transactions in the past) is that dbmail doesn't use
> > >>transaction.
> > >>
> > >>Because of that AutoCommit is used and whenever dbmail does
> > >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that

> > >>is terribly slow. To ensure the Durability in ACID the database
> > >>has to fflush() every transaction to stable storage! That's why
> > >>there is only one solution: we have to use transaction.
> > >>
> > >>With transactions we could remove the integrity checks of
> > >>dbmail-maintenance too, because the database guarantees integrity.
> > >>
> > >>Anyway, I did a trace of all SQL queries when a mail is copied
> > >>using IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations
> > >>to insert one mail?).
> > >>
> > >>When searching for the sequential scan I found something quite
> > >>interesting in the docs: the planer decides for every scan if a
> > >>seqScan is cheaper that an index scan, and does a seqScan even if
> > >>an index
> > >>exists:
> > >>
> > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
owner_idnr=2;
> > >> QUERY PLAN
> > >>---------------------------------------------------------
> > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > >> Filter: (owner_idnr = 2)
> > >>(2 rows)
> > >>
> > >>The table has an index on owner_idnr.
> > >>
> > >>So I should repeat this test with a database with several hundred
> > >>to thousand user, several dozen mailboxes for each user and
> > >>several dozen mails in each mailbox to find out if all required
> > >>indizes are there. Did anyone write a script to create such a
> > >>database?
> > >>
> > >>But I found a strange query:
> > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> > >>owner_idnr = '2' mailbox_idnr is the primary key so that could be
> > >>optimized to: SELECT 4
> > >>;-)
> > >>
> > >>
> > >>I don't have a 1.2 installation, I'm sorry.
> > >>
> > >>
> > >>--
> > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key
(95702B3B)
> > >>_______________________________________________
> > >>Dbmail-dev mailing list
> > >>Dbmail-dev@dbmail.org
> > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > >>
> > >>
> > >>
> > >
> > >
> > >
> > >
> > >
> >
> >
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> >
>
>

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev
RE: some speed tests [ In reply to ]
The only question is if the keywords BEGIN and COMMIT are ignored or if they
generate an error. In the former case, no problems. In the latter case, we
have to either drop support for certain configurations or code around them.

Aaron


"John Hansen" <john@geeknet.com.au> said:

> I assume here that you talk about MySQL 4.
> MySQL 3 does not have transactions.
>
> -----Original Message-----
> From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> On Behalf Of Chris Nolan
> Sent: Sunday, February 15, 2004 11:57 AM
> To: dbmail-dev@dbmail.org
> Subject: Re: [Dbmail-dev] some speed tests
>
>
> Forgive the bluntness of the statement, but why is anyone even worrying
> about transactions as they relate to MySQL???
>
> COMMIT in MySQL is passed to the table handler. In the case of MyISAM
> tables, the handler disregards the statement. For InnoDB and BDB tables,
>
> COMMIT acts as it does in PostgreSQL.
>
> So various people in this thread implying that MySQL isn't really a
> database need to do some more reading.
>
> In summary, just encapsulate everything in transaction blocks and the
> underlaying database will act appropriately.
>
> Regards,
>
> Chris
>
> Aaron Stone wrote:
>
> >I don't even know where to begin in terms of designing the delivery
> >chain around transactions. Could we do it as simply as adding
> >functions...
> >
> > void db_begin_transaction(void);
> > void db_flush_transaction(void); (or db_commit_...?)
> >
> >and then calling these functions before and after each major section of
>
> >database code? For the delivery chain, we could do it inside of
> >insert_message(). For dbmail-smtp, this basically means that the
> >execution of the whole program is within one transaction. For
> >dbmail-lmtpd, it means that each message is delivered within a
> >transaction but the miscellaneous queries before the main message
> >delivery chain are not transacted. For MySQL, these functions would be
> >noops.
> >
> >Thing that might work?
> >
> >Aaron
> >
> >
> >Thomas Mueller <dbmail@tmueller.com> said:
> >
> >
> >
> >>Hi Aaron,
> >>
> >>
> >>
> >>>Do you have any way of narrowing this down to specific queries that
> >>>are taking the longest and/or are being executed the most? That would
>
> >>>identify which low-level database functions are being called, then we
>
> >>>can just trace our way up the call chain to see who's misbehaving or
> >>>acting on a flawed design. Also, if you could run similar tests
> >>>against the latest 1.2, it would help to give a frame of reference,
> >>>particularly for my delivery chain design.
> >>>
> >>>
> >>That's simple: the main design flaw (actually that's no design flaw I
> >>think, that's because the so called database MySQL couldn't do
> >>transactions in the past) is that dbmail doesn't use transaction.
> >>
> >>Because of that AutoCommit is used and whenever dbmail does
> >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that is
> >>terribly slow. To ensure the Durability in ACID the database has to
> >>fflush() every transaction to stable storage! That's why there is only
>
> >>one solution: we have to use transaction.
> >>
> >>With transactions we could remove the integrity checks of
> >>dbmail-maintenance too, because the database guarantees integrity.
> >>
> >>Anyway, I did a trace of all SQL queries when a mail is copied using
> >>IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations to insert
> >>one mail?).
> >>
> >>When searching for the sequential scan I found something quite
> >>interesting in the docs: the planer decides for every scan if a
> >>seqScan is cheaper that an index scan, and does a seqScan even if an
> >>index
> >>exists:
> >>
> >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> owner_idnr=2;
> >> QUERY PLAN
> >>---------------------------------------------------------
> >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> >> Filter: (owner_idnr = 2)
> >>(2 rows)
> >>
> >>The table has an index on owner_idnr.
> >>
> >>So I should repeat this test with a database with several hundred to
> >>thousand user, several dozen mailboxes for each user and several dozen
>
> >>mails in each mailbox to find out if all required indizes are there.
> >>Did anyone write a script to create such a database?
> >>
> >>But I found a strange query:
> >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> >>owner_idnr = '2' mailbox_idnr is the primary key so that could be
> >>optimized to: SELECT 4
> >>;-)
> >>
> >>
> >>I don't have a 1.2 installation, I'm sorry.
> >>
> >>
> >>--
> >>MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
> >>_______________________________________________
> >>Dbmail-dev mailing list
> >>Dbmail-dev@dbmail.org
> >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> >>
> >>
> >>
> >
> >
> >
> >
> >
>
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>



--
RE: some speed tests [ In reply to ]
I think you can disagree with the statement without resenting it...

My experience with MySQL has been phenominal speed *and* reliability, though
at the expense of some pretty useful SQL features. If you've had trouble with
it, you can easily get lots of help from this list and others just by asking.

Bashing MySQL is fine, but please label it as such (possibly even being so
kind as to qualify your bashes with experience or reason). Instead, you've
gotten several people, myself included, into a thread arguing about all sorts
of things really just because of an infusion of bad karma that nobody needs.

Aaron


"John Hansen" <john@geeknet.com.au> said:

> Personally, I recent the statement; 'Considering MySQL has proven itself
> time and time again in terms of reliability, licencing flexibility and
> performance, such statements are baseless regardless of what arguments
> you make for which features'; based on experience, mysql proved that it
> was not reliable, tho performance was top of the line.
>
> In my opinion, if you're concerned with reliability, mysql should not be
> an option.
> I had daily table corruptions, and thus one of my main tasks was to run
> mysql repair jobs regularly, just to keep everything running.
>
> Granted, postgresql has it's own problems, mainly with performance and
> lack of a proper master / slave replication model, but that's a small
> price to pay for reliability.
>
> Just my $0.02 worth.
>
> Regards,
>
> John Hansen
>
> -----Original Message-----
> From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> On Behalf Of Chris Nolan
> Sent: Sunday, February 15, 2004 4:42 PM
> To: dbmail-dev@dbmail.org
> Subject: Re: [Dbmail-dev] some speed tests
>
>
> Dear Aaron,
>
> My comments are also inline. :-)
>
> On Sun, 2004-02-15 at 12:27, Aaron Stone wrote:
> > Comments inline...
> >
> > Chris Nolan <chris@imi.com.au> said:
> >
> > > Forgive the bluntness of the statement, but why is anyone even
> > > worrying
> > > about transactions as they relate to MySQL???
> >
> > Because we are currently structured to have a single set of mid-level
> > database operations that are translated into specific low-level
> > database function calls. The upside is that there isn't any database
> > specific handling outside of these mid-level functions. The downside
> > is that we need to make sure that function calls for certain features
> > are consistent with analogous features in each database and that they
> > carry with them enough information to make the appropriate low-level
> > call. A good example is the last inserted id number. In MySQL, you
> > only need the database connection identifier to get this. In
> > PostgreSQL, yo need both the database connection and the table
> > identifier. For DBMail to have a mid-level function that worked for
> > both, we'd have to make sure that it took both arguments and used them
>
> > as needed for whichever database's low-level calls were being used.
> > (Those were off the top of my head, so they may be incorrect, but they
>
> > do illustrate my point.)
>
> I've looked through the DBMail source code on a few occassions and even
> released a dodgy tool to convert Cyrus mailboxes to DBMail a while ago.
> The fact that I can use the calls present in the DB modules and avoid
> having to write queries by hand for each different DB is an excellent
> feature of the DBMail API. I have nothing negative to say here at all.
>
> >
> > > COMMIT in MySQL is passed to the table handler. In the case of
> > > MyISAM
> > > tables, the handler disregards the statement. For InnoDB and BDB
> tables,
> > > COMMIT acts as it does in PostgreSQL.
> >
> > So then we have THREE different configurations to consider, and need
> > to be sure to design the mid-level interface appropriately.
>
> My statement was meant to say that the MyISAM table handler will just
> disregard BEGIN and COMMIT statements, just as it parses but ignores
> CHECK and FOREIGN KEY constraints in table creation, thus illustrating
> that you wouldn't need to add a mysql-with-transactions directory to the
> source tree.
>
> >
> > > So various people in this thread implying that MySQL isn't really a
> > > database need to do some more reading.
> >
> > If you're referring to my suggestion that the transaction functions
> > are a noop for MySQL, then you're reading too deeply... I just didn't
> > realize that InnoDB would handle transactions entirely normally.
>
> My apologies, I was not referring to you at all! Yourself and your
> collegues have provided the world with a very, very funky mail
> repository!
>
> I was referring to someone who said in the body of their message posted
> to another branch of this thread "the supposed database mysql".
> Considering MySQL has proven itself time and time again in terms of
> reliability, licencing flexibility and performance, such statements are
> baseless regardless of what arguments you make for which features. For
> example, Visual FoxPro's backend is technically a database but provides
> nothing in the way of a privilege system (you need write access to
> record locking, so filesystem-level controls are no good either).
>
> >
> > > In summary, just encapsulate everything in transaction blocks and
> > > the
> > > underlaying database will act appropriately.
> >
> > Right. That's what we're talking about.
> >
> > Is the entire point of your comment that we can safely use
> >
> > BEGIN;
> >
> > .. whatever dbmail does...
> >
> > COMMIT;
> >
> > regardless of whether or not we know if the host database actually
> > supports these keywords? You could have just said that.
>
> I could have, but then I would have risked putting an end to the thread.
> :-)
>
> Please accept my apology - I never meant to offend yourself or anyone
> else who contributes to DBMail. It seems my reading of the post I
> mentioned above caused me to be a bit heavier than I should have been.
>
> But yes, you have perfectly summarised the point of my comment.
> >
> > > Regards,
> > >
> > > Chris
> > >
> > > Aaron Stone wrote:
> > >
> > > >I don't even know where to begin in terms of designing the delivery
>
> > > >chain around transactions. Could we do it as simply as adding
> > > >functions...
> > > >
> > > > void db_begin_transaction(void);
> > > > void db_flush_transaction(void); (or db_commit_...?)
> > > >
> > > >and then calling these functions before and after each major
> > > >section of database code? For the delivery chain, we could do it
> > > >inside of insert_message(). For dbmail-smtp, this basically means
> > > >that the execution of the whole program is within one transaction.
> > > >For dbmail-lmtpd, it means that each message is delivered within a
> > > >transaction but the miscellaneous queries before the main message
> > > >delivery chain are not transacted. For MySQL, these functions would
>
> > > >be noops.
> > > >
> > > >Thing that might work?
> > > >
> > > >Aaron
> > > >
> > > >
> > > >Thomas Mueller <dbmail@tmueller.com> said:
> > > >
> > > >
> > > >
> > > >>Hi Aaron,
> > > >>
> > > >>
> > > >>
> > > >>>Do you have any way of narrowing this down to specific queries
> > > >>>that are taking the longest and/or are being executed the most?
> > > >>>That would identify which low-level database functions are being
> > > >>>called, then we can just trace our way up the call chain to see
> > > >>>who's misbehaving or acting on a flawed design. Also, if you
> > > >>>could run similar tests against the latest 1.2, it would help to
> > > >>>give a frame of reference, particularly for my delivery chain
> > > >>>design.
> > > >>>
> > > >>>
> > > >>That's simple: the main design flaw (actually that's no design
> > > >>flaw I think, that's because the so called database MySQL couldn't
>
> > > >>do transactions in the past) is that dbmail doesn't use
> > > >>transaction.
> > > >>
> > > >>Because of that AutoCommit is used and whenever dbmail does
> > > >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that
>
> > > >>is terribly slow. To ensure the Durability in ACID the database
> > > >>has to fflush() every transaction to stable storage! That's why
> > > >>there is only one solution: we have to use transaction.
> > > >>
> > > >>With transactions we could remove the integrity checks of
> > > >>dbmail-maintenance too, because the database guarantees integrity.
> > > >>
> > > >>Anyway, I did a trace of all SQL queries when a mail is copied
> > > >>using IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations
> > > >>to insert one mail?).
> > > >>
> > > >>When searching for the sequential scan I found something quite
> > > >>interesting in the docs: the planer decides for every scan if a
> > > >>seqScan is cheaper that an index scan, and does a seqScan even if
> > > >>an index
> > > >>exists:
> > > >>
> > > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> owner_idnr=2;
> > > >> QUERY PLAN
> > > >>---------------------------------------------------------
> > > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > > >> Filter: (owner_idnr = 2)
> > > >>(2 rows)
> > > >>
> > > >>The table has an index on owner_idnr.
> > > >>
> > > >>So I should repeat this test with a database with several hundred
> > > >>to thousand user, several dozen mailboxes for each user and
> > > >>several dozen mails in each mailbox to find out if all required
> > > >>indizes are there. Did anyone write a script to create such a
> > > >>database?
> > > >>
> > > >>But I found a strange query:
> > > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> > > >>owner_idnr = '2' mailbox_idnr is the primary key so that could be
> > > >>optimized to: SELECT 4
> > > >>;-)
> > > >>
> > > >>
> > > >>I don't have a 1.2 installation, I'm sorry.
> > > >>
> > > >>
> > > >>--
> > > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key
> (95702B3B)
> > > >>_______________________________________________
> > > >>Dbmail-dev mailing list
> > > >>Dbmail-dev@dbmail.org
> > > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > >>
> > > >>
> > > >>
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > > _______________________________________________
> > > Dbmail-dev mailing list
> > > Dbmail-dev@dbmail.org
> > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > >
> >
> >
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>



--
RE: some speed tests [ In reply to ]
Indeed I should have used a different word.

Mind you I didn't start this thread or state that mysql isn't a real
database,. I simply responded to what I saw as words of praise where
they in my opinion aren't warranted.

But everyone has a right to their own oppinion, and I say good on them
for those that can use mysql without running in to the same problems I
had.

As I said,. I feel mysql has it's place, just not in mission critical
applications.

Regards,

John Hansen

-----Original Message-----
From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
On Behalf Of Aaron Stone
Sent: Sunday, February 15, 2004 5:48 PM
To: dbmail-dev@dbmail.org
Subject: RE: [Dbmail-dev] some speed tests


I think you can disagree with the statement without resenting it...

My experience with MySQL has been phenominal speed *and* reliability,
though at the expense of some pretty useful SQL features. If you've had
trouble with it, you can easily get lots of help from this list and
others just by asking.

Bashing MySQL is fine, but please label it as such (possibly even being
so kind as to qualify your bashes with experience or reason). Instead,
you've gotten several people, myself included, into a thread arguing
about all sorts of things really just because of an infusion of bad
karma that nobody needs.

Aaron


"John Hansen" <john@geeknet.com.au> said:

> Personally, I recent the statement; 'Considering MySQL has proven
> itself time and time again in terms of reliability, licencing
> flexibility and performance, such statements are baseless regardless
> of what arguments you make for which features'; based on experience,
> mysql proved that it was not reliable, tho performance was top of the
> line.
>
> In my opinion, if you're concerned with reliability, mysql should not
> be an option. I had daily table corruptions, and thus one of my main
> tasks was to run mysql repair jobs regularly, just to keep everything
> running.
>
> Granted, postgresql has it's own problems, mainly with performance and

> lack of a proper master / slave replication model, but that's a small
> price to pay for reliability.
>
> Just my $0.02 worth.
>
> Regards,
>
> John Hansen
>
> -----Original Message-----
> From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> On Behalf Of Chris Nolan
> Sent: Sunday, February 15, 2004 4:42 PM
> To: dbmail-dev@dbmail.org
> Subject: Re: [Dbmail-dev] some speed tests
>
>
> Dear Aaron,
>
> My comments are also inline. :-)
>
> On Sun, 2004-02-15 at 12:27, Aaron Stone wrote:
> > Comments inline...
> >
> > Chris Nolan <chris@imi.com.au> said:
> >
> > > Forgive the bluntness of the statement, but why is anyone even
> > > worrying
> > > about transactions as they relate to MySQL???
> >
> > Because we are currently structured to have a single set of
> > mid-level
> > database operations that are translated into specific low-level
> > database function calls. The upside is that there isn't any database

> > specific handling outside of these mid-level functions. The downside

> > is that we need to make sure that function calls for certain
features
> > are consistent with analogous features in each database and that
they
> > carry with them enough information to make the appropriate low-level

> > call. A good example is the last inserted id number. In MySQL, you
> > only need the database connection identifier to get this. In
> > PostgreSQL, yo need both the database connection and the table
> > identifier. For DBMail to have a mid-level function that worked for
> > both, we'd have to make sure that it took both arguments and used
them
>
> > as needed for whichever database's low-level calls were being used.
> > (Those were off the top of my head, so they may be incorrect, but
they
>
> > do illustrate my point.)
>
> I've looked through the DBMail source code on a few occassions and
> even released a dodgy tool to convert Cyrus mailboxes to DBMail a
> while ago. The fact that I can use the calls present in the DB modules

> and avoid having to write queries by hand for each different DB is an
> excellent feature of the DBMail API. I have nothing negative to say
> here at all.
>
> >
> > > COMMIT in MySQL is passed to the table handler. In the case of
> > > MyISAM
> > > tables, the handler disregards the statement. For InnoDB and BDB
> tables,
> > > COMMIT acts as it does in PostgreSQL.
> >
> > So then we have THREE different configurations to consider, and need
> > to be sure to design the mid-level interface appropriately.
>
> My statement was meant to say that the MyISAM table handler will just
> disregard BEGIN and COMMIT statements, just as it parses but ignores
> CHECK and FOREIGN KEY constraints in table creation, thus illustrating

> that you wouldn't need to add a mysql-with-transactions directory to
> the source tree.
>
> >
> > > So various people in this thread implying that MySQL isn't really
> > > a database need to do some more reading.
> >
> > If you're referring to my suggestion that the transaction functions
> > are a noop for MySQL, then you're reading too deeply... I just
didn't
> > realize that InnoDB would handle transactions entirely normally.
>
> My apologies, I was not referring to you at all! Yourself and your
> collegues have provided the world with a very, very funky mail
> repository!
>
> I was referring to someone who said in the body of their message
> posted to another branch of this thread "the supposed database mysql".

> Considering MySQL has proven itself time and time again in terms of
> reliability, licencing flexibility and performance, such statements
> are baseless regardless of what arguments you make for which features.

> For example, Visual FoxPro's backend is technically a database but
> provides nothing in the way of a privilege system (you need write
> access to record locking, so filesystem-level controls are no good
> either).
>
> >
> > > In summary, just encapsulate everything in transaction blocks and
> > > the
> > > underlaying database will act appropriately.
> >
> > Right. That's what we're talking about.
> >
> > Is the entire point of your comment that we can safely use
> >
> > BEGIN;
> >
> > .. whatever dbmail does...
> >
> > COMMIT;
> >
> > regardless of whether or not we know if the host database actually
> > supports these keywords? You could have just said that.
>
> I could have, but then I would have risked putting an end to the
> thread.
> :-)
>
> Please accept my apology - I never meant to offend yourself or anyone
> else who contributes to DBMail. It seems my reading of the post I
> mentioned above caused me to be a bit heavier than I should have been.
>
> But yes, you have perfectly summarised the point of my comment.
> >
> > > Regards,
> > >
> > > Chris
> > >
> > > Aaron Stone wrote:
> > >
> > > >I don't even know where to begin in terms of designing the
> > > >delivery
>
> > > >chain around transactions. Could we do it as simply as adding
> > > >functions...
> > > >
> > > > void db_begin_transaction(void);
> > > > void db_flush_transaction(void); (or db_commit_...?)
> > > >
> > > >and then calling these functions before and after each major
> > > >section of database code? For the delivery chain, we could do it
> > > >inside of insert_message(). For dbmail-smtp, this basically means

> > > >that the execution of the whole program is within one
transaction.
> > > >For dbmail-lmtpd, it means that each message is delivered within
a
> > > >transaction but the miscellaneous queries before the main message

> > > >delivery chain are not transacted. For MySQL, these functions
would
>
> > > >be noops.
> > > >
> > > >Thing that might work?
> > > >
> > > >Aaron
> > > >
> > > >
> > > >Thomas Mueller <dbmail@tmueller.com> said:
> > > >
> > > >
> > > >
> > > >>Hi Aaron,
> > > >>
> > > >>
> > > >>
> > > >>>Do you have any way of narrowing this down to specific queries
> > > >>>that are taking the longest and/or are being executed the most?

> > > >>>That would identify which low-level database functions are
being
> > > >>>called, then we can just trace our way up the call chain to see

> > > >>>who's misbehaving or acting on a flawed design. Also, if you
> > > >>>could run similar tests against the latest 1.2, it would help
to
> > > >>>give a frame of reference, particularly for my delivery chain
> > > >>>design.
> > > >>>
> > > >>>
> > > >>That's simple: the main design flaw (actually that's no design
> > > >>flaw I think, that's because the so called database MySQL
couldn't
>
> > > >>do transactions in the past) is that dbmail doesn't use
> > > >>transaction.
> > > >>
> > > >>Because of that AutoCommit is used and whenever dbmail does
> > > >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and
that
>
> > > >>is terribly slow. To ensure the Durability in ACID the database
> > > >>has to fflush() every transaction to stable storage! That's why
> > > >>there is only one solution: we have to use transaction.
> > > >>
> > > >>With transactions we could remove the integrity checks of
> > > >>dbmail-maintenance too, because the database guarantees
integrity.
> > > >>
> > > >>Anyway, I did a trace of all SQL queries when a mail is copied
> > > >>using IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db
operations
> > > >>to insert one mail?).
> > > >>
> > > >>When searching for the sequential scan I found something quite
> > > >>interesting in the docs: the planer decides for every scan if a
> > > >>seqScan is cheaper that an index scan, and does a seqScan even
if
> > > >>an index
> > > >>exists:
> > > >>
> > > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> owner_idnr=2;
> > > >> QUERY PLAN
> > > >>---------------------------------------------------------
> > > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > > >> Filter: (owner_idnr = 2)
> > > >>(2 rows)
> > > >>
> > > >>The table has an index on owner_idnr.
> > > >>
> > > >>So I should repeat this test with a database with several
> > > >>hundred
> > > >>to thousand user, several dozen mailboxes for each user and
> > > >>several dozen mails in each mailbox to find out if all required
> > > >>indizes are there. Did anyone write a script to create such a
> > > >>database?
> > > >>
> > > >>But I found a strange query:
> > > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> > > >>owner_idnr = '2' mailbox_idnr is the primary key so that could
be
> > > >>optimized to: SELECT 4
> > > >>;-)
> > > >>
> > > >>
> > > >>I don't have a 1.2 installation, I'm sorry.
> > > >>
> > > >>
> > > >>--
> > > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key
> (95702B3B)
> > > >>_______________________________________________
> > > >>Dbmail-dev mailing list
> > > >>Dbmail-dev@dbmail.org
> > > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > >>
> > > >>
> > > >>
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > > _______________________________________________
> > > Dbmail-dev mailing list
> > > Dbmail-dev@dbmail.org
> > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > >
> >
> >
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>



--



_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev
RE: some speed tests [ In reply to ]
Also,...

If I have offended you in anyway, I apologise.
It was not my intention.


... John

-----Original Message-----
From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
On Behalf Of Aaron Stone
Sent: Sunday, February 15, 2004 5:48 PM
To: dbmail-dev@dbmail.org
Subject: RE: [Dbmail-dev] some speed tests


I think you can disagree with the statement without resenting it...

My experience with MySQL has been phenominal speed *and* reliability,
though at the expense of some pretty useful SQL features. If you've had
trouble with it, you can easily get lots of help from this list and
others just by asking.

Bashing MySQL is fine, but please label it as such (possibly even being
so kind as to qualify your bashes with experience or reason). Instead,
you've gotten several people, myself included, into a thread arguing
about all sorts of things really just because of an infusion of bad
karma that nobody needs.

Aaron


"John Hansen" <john@geeknet.com.au> said:

> Personally, I recent the statement; 'Considering MySQL has proven
> itself time and time again in terms of reliability, licencing
> flexibility and performance, such statements are baseless regardless
> of what arguments you make for which features'; based on experience,
> mysql proved that it was not reliable, tho performance was top of the
> line.
>
> In my opinion, if you're concerned with reliability, mysql should not
> be an option. I had daily table corruptions, and thus one of my main
> tasks was to run mysql repair jobs regularly, just to keep everything
> running.
>
> Granted, postgresql has it's own problems, mainly with performance and
> lack of a proper master / slave replication model, but that's a small
> price to pay for reliability.
>
> Just my $0.02 worth.
>
> Regards,
>
> John Hansen
>
> -----Original Message-----
> From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> On Behalf Of Chris Nolan
> Sent: Sunday, February 15, 2004 4:42 PM
> To: dbmail-dev@dbmail.org
> Subject: Re: [Dbmail-dev] some speed tests
>
>
> Dear Aaron,
>
> My comments are also inline. :-)
>
> On Sun, 2004-02-15 at 12:27, Aaron Stone wrote:
> > Comments inline...
> >
> > Chris Nolan <chris@imi.com.au> said:
> >
> > > Forgive the bluntness of the statement, but why is anyone even
> > > worrying about transactions as they relate to MySQL???
> >
> > Because we are currently structured to have a single set of
> > mid-level
> > database operations that are translated into specific low-level
> > database function calls. The upside is that there isn't any database

> > specific handling outside of these mid-level functions. The downside

> > is that we need to make sure that function calls for certain
features
> > are consistent with analogous features in each database and that
they
> > carry with them enough information to make the appropriate low-level

> > call. A good example is the last inserted id number. In MySQL, you
> > only need the database connection identifier to get this. In
> > PostgreSQL, yo need both the database connection and the table
> > identifier. For DBMail to have a mid-level function that worked for
> > both, we'd have to make sure that it took both arguments and used
them
>
> > as needed for whichever database's low-level calls were being used.
> > (Those were off the top of my head, so they may be incorrect, but
> > they
>
> > do illustrate my point.)
>
> I've looked through the DBMail source code on a few occassions and
> even released a dodgy tool to convert Cyrus mailboxes to DBMail a
> while ago. The fact that I can use the calls present in the DB modules

> and avoid having to write queries by hand for each different DB is an
> excellent feature of the DBMail API. I have nothing negative to say
> here at all.
>
> >
> > > COMMIT in MySQL is passed to the table handler. In the case of
> > > MyISAM tables, the handler disregards the statement. For InnoDB
> > > and BDB
> tables,
> > > COMMIT acts as it does in PostgreSQL.
> >
> > So then we have THREE different configurations to consider, and need

> > to be sure to design the mid-level interface appropriately.
>
> My statement was meant to say that the MyISAM table handler will just
> disregard BEGIN and COMMIT statements, just as it parses but ignores
> CHECK and FOREIGN KEY constraints in table creation, thus illustrating

> that you wouldn't need to add a mysql-with-transactions directory to
> the source tree.
>
> >
> > > So various people in this thread implying that MySQL isn't really
> > > a database need to do some more reading.
> >
> > If you're referring to my suggestion that the transaction functions
> > are a noop for MySQL, then you're reading too deeply... I just
> > didn't realize that InnoDB would handle transactions entirely
> > normally.
>
> My apologies, I was not referring to you at all! Yourself and your
> collegues have provided the world with a very, very funky mail
> repository!
>
> I was referring to someone who said in the body of their message
> posted to another branch of this thread "the supposed database mysql".

> Considering MySQL has proven itself time and time again in terms of
> reliability, licencing flexibility and performance, such statements
> are baseless regardless of what arguments you make for which features.

> For example, Visual FoxPro's backend is technically a database but
> provides nothing in the way of a privilege system (you need write
> access to record locking, so filesystem-level controls are no good
> either).
>
> >
> > > In summary, just encapsulate everything in transaction blocks and
> > > the underlaying database will act appropriately.
> >
> > Right. That's what we're talking about.
> >
> > Is the entire point of your comment that we can safely use
> >
> > BEGIN;
> >
> > .. whatever dbmail does...
> >
> > COMMIT;
> >
> > regardless of whether or not we know if the host database actually
> > supports these keywords? You could have just said that.
>
> I could have, but then I would have risked putting an end to the
> thread.
> :-)
>
> Please accept my apology - I never meant to offend yourself or anyone
> else who contributes to DBMail. It seems my reading of the post I
> mentioned above caused me to be a bit heavier than I should have been.
>
> But yes, you have perfectly summarised the point of my comment.
> >
> > > Regards,
> > >
> > > Chris
> > >
> > > Aaron Stone wrote:
> > >
> > > >I don't even know where to begin in terms of designing the
> > > >delivery
>
> > > >chain around transactions. Could we do it as simply as adding
> > > >functions...
> > > >
> > > > void db_begin_transaction(void);
> > > > void db_flush_transaction(void); (or db_commit_...?)
> > > >
> > > >and then calling these functions before and after each major
> > > >section of database code? For the delivery chain, we could do it
> > > >inside of insert_message(). For dbmail-smtp, this basically means

> > > >that the execution of the whole program is within one
> > > >transaction. For dbmail-lmtpd, it means that each message is
> > > >delivered within a transaction but the miscellaneous queries
> > > >before the main message delivery chain are not transacted. For
> > > >MySQL, these functions would
>
> > > >be noops.
> > > >
> > > >Thing that might work?
> > > >
> > > >Aaron
> > > >
> > > >
> > > >Thomas Mueller <dbmail@tmueller.com> said:
> > > >
> > > >
> > > >
> > > >>Hi Aaron,
> > > >>
> > > >>
> > > >>
> > > >>>Do you have any way of narrowing this down to specific queries
> > > >>>that are taking the longest and/or are being executed the most?

> > > >>>That would identify which low-level database functions are
> > > >>>being called, then we can just trace our way up the call chain
> > > >>>to see who's misbehaving or acting on a flawed design. Also, if

> > > >>>you could run similar tests against the latest 1.2, it would
> > > >>>help to give a frame of reference, particularly for my delivery

> > > >>>chain design.
> > > >>>
> > > >>>
> > > >>That's simple: the main design flaw (actually that's no design
> > > >>flaw I think, that's because the so called database MySQL
> > > >>couldn't
>
> > > >>do transactions in the past) is that dbmail doesn't use
> > > >>transaction.
> > > >>
> > > >>Because of that AutoCommit is used and whenever dbmail does
> > > >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and
> > > >>that
>
> > > >>is terribly slow. To ensure the Durability in ACID the database
> > > >>has to fflush() every transaction to stable storage! That's why
> > > >>there is only one solution: we have to use transaction.
> > > >>
> > > >>With transactions we could remove the integrity checks of
> > > >>dbmail-maintenance too, because the database guarantees
> > > >>integrity.
> > > >>
> > > >>Anyway, I did a trace of all SQL queries when a mail is copied
> > > >>using IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db
> > > >>operations to insert one mail?).
> > > >>
> > > >>When searching for the sequential scan I found something quite
> > > >>interesting in the docs: the planer decides for every scan if a
> > > >>seqScan is cheaper that an index scan, and does a seqScan even
> > > >>if an index
> > > >>exists:
> > > >>
> > > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> owner_idnr=2;
> > > >> QUERY PLAN
> > > >>---------------------------------------------------------
> > > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > > >> Filter: (owner_idnr = 2)
> > > >>(2 rows)
> > > >>
> > > >>The table has an index on owner_idnr.
> > > >>
> > > >>So I should repeat this test with a database with several
> > > >>hundred
> > > >>to thousand user, several dozen mailboxes for each user and
> > > >>several dozen mails in each mailbox to find out if all required
> > > >>indizes are there. Did anyone write a script to create such a
> > > >>database?
> > > >>
> > > >>But I found a strange query:
> > > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> > > >>owner_idnr = '2' mailbox_idnr is the primary key so that could
> > > >>be optimized to: SELECT 4
> > > >>;-)
> > > >>
> > > >>
> > > >>I don't have a 1.2 installation, I'm sorry.
> > > >>
> > > >>
> > > >>--
> > > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key
> (95702B3B)
> > > >>_______________________________________________
> > > >>Dbmail-dev mailing list
> > > >>Dbmail-dev@dbmail.org
> > > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > >>
> > > >>
> > > >>
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > > _______________________________________________
> > > Dbmail-dev mailing list
> > > Dbmail-dev@dbmail.org
> > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > >
> >
> >
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>



--



_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev
RE: some speed tests [ In reply to ]
Table corruptions?

I'd be very interested to hear the story around your grief, if for
nothing else, for reference purposes. :-)

Regards,

Chris

On Sun, 2004-02-15 at 17:19, John Hansen wrote:
> Personally, I recent the statement; 'Considering MySQL has proven itself
> time and time again in terms of reliability, licencing flexibility and
> performance, such statements are baseless regardless of what arguments
> you make for which features'; based on experience, mysql proved that it
> was not reliable, tho performance was top of the line.
>
> In my opinion, if you're concerned with reliability, mysql should not be
> an option.
> I had daily table corruptions, and thus one of my main tasks was to run
> mysql repair jobs regularly, just to keep everything running.
>
> Granted, postgresql has it's own problems, mainly with performance and
> lack of a proper master / slave replication model, but that's a small
> price to pay for reliability.
>
> Just my $0.02 worth.
>
> Regards,
>
> John Hansen
>
> -----Original Message-----
> From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> On Behalf Of Chris Nolan
> Sent: Sunday, February 15, 2004 4:42 PM
> To: dbmail-dev@dbmail.org
> Subject: Re: [Dbmail-dev] some speed tests
>
>
> Dear Aaron,
>
> My comments are also inline. :-)
>
> On Sun, 2004-02-15 at 12:27, Aaron Stone wrote:
> > Comments inline...
> >
> > Chris Nolan <chris@imi.com.au> said:
> >
> > > Forgive the bluntness of the statement, but why is anyone even
> > > worrying
> > > about transactions as they relate to MySQL???
> >
> > Because we are currently structured to have a single set of mid-level
> > database operations that are translated into specific low-level
> > database function calls. The upside is that there isn't any database
> > specific handling outside of these mid-level functions. The downside
> > is that we need to make sure that function calls for certain features
> > are consistent with analogous features in each database and that they
> > carry with them enough information to make the appropriate low-level
> > call. A good example is the last inserted id number. In MySQL, you
> > only need the database connection identifier to get this. In
> > PostgreSQL, yo need both the database connection and the table
> > identifier. For DBMail to have a mid-level function that worked for
> > both, we'd have to make sure that it took both arguments and used them
>
> > as needed for whichever database's low-level calls were being used.
> > (Those were off the top of my head, so they may be incorrect, but they
>
> > do illustrate my point.)
>
> I've looked through the DBMail source code on a few occassions and even
> released a dodgy tool to convert Cyrus mailboxes to DBMail a while ago.
> The fact that I can use the calls present in the DB modules and avoid
> having to write queries by hand for each different DB is an excellent
> feature of the DBMail API. I have nothing negative to say here at all.
>
> >
> > > COMMIT in MySQL is passed to the table handler. In the case of
> > > MyISAM
> > > tables, the handler disregards the statement. For InnoDB and BDB
> tables,
> > > COMMIT acts as it does in PostgreSQL.
> >
> > So then we have THREE different configurations to consider, and need
> > to be sure to design the mid-level interface appropriately.
>
> My statement was meant to say that the MyISAM table handler will just
> disregard BEGIN and COMMIT statements, just as it parses but ignores
> CHECK and FOREIGN KEY constraints in table creation, thus illustrating
> that you wouldn't need to add a mysql-with-transactions directory to the
> source tree.
>
> >
> > > So various people in this thread implying that MySQL isn't really a
> > > database need to do some more reading.
> >
> > If you're referring to my suggestion that the transaction functions
> > are a noop for MySQL, then you're reading too deeply... I just didn't
> > realize that InnoDB would handle transactions entirely normally.
>
> My apologies, I was not referring to you at all! Yourself and your
> collegues have provided the world with a very, very funky mail
> repository!
>
> I was referring to someone who said in the body of their message posted
> to another branch of this thread "the supposed database mysql".
> Considering MySQL has proven itself time and time again in terms of
> reliability, licencing flexibility and performance, such statements are
> baseless regardless of what arguments you make for which features. For
> example, Visual FoxPro's backend is technically a database but provides
> nothing in the way of a privilege system (you need write access to
> record locking, so filesystem-level controls are no good either).
>
> >
> > > In summary, just encapsulate everything in transaction blocks and
> > > the
> > > underlaying database will act appropriately.
> >
> > Right. That's what we're talking about.
> >
> > Is the entire point of your comment that we can safely use
> >
> > BEGIN;
> >
> > .. whatever dbmail does...
> >
> > COMMIT;
> >
> > regardless of whether or not we know if the host database actually
> > supports these keywords? You could have just said that.
>
> I could have, but then I would have risked putting an end to the thread.
> :-)
>
> Please accept my apology - I never meant to offend yourself or anyone
> else who contributes to DBMail. It seems my reading of the post I
> mentioned above caused me to be a bit heavier than I should have been.
>
> But yes, you have perfectly summarised the point of my comment.
> >
> > > Regards,
> > >
> > > Chris
> > >
> > > Aaron Stone wrote:
> > >
> > > >I don't even know where to begin in terms of designing the delivery
>
> > > >chain around transactions. Could we do it as simply as adding
> > > >functions...
> > > >
> > > > void db_begin_transaction(void);
> > > > void db_flush_transaction(void); (or db_commit_...?)
> > > >
> > > >and then calling these functions before and after each major
> > > >section of database code? For the delivery chain, we could do it
> > > >inside of insert_message(). For dbmail-smtp, this basically means
> > > >that the execution of the whole program is within one transaction.
> > > >For dbmail-lmtpd, it means that each message is delivered within a
> > > >transaction but the miscellaneous queries before the main message
> > > >delivery chain are not transacted. For MySQL, these functions would
>
> > > >be noops.
> > > >
> > > >Thing that might work?
> > > >
> > > >Aaron
> > > >
> > > >
> > > >Thomas Mueller <dbmail@tmueller.com> said:
> > > >
> > > >
> > > >
> > > >>Hi Aaron,
> > > >>
> > > >>
> > > >>
> > > >>>Do you have any way of narrowing this down to specific queries
> > > >>>that are taking the longest and/or are being executed the most?
> > > >>>That would identify which low-level database functions are being
> > > >>>called, then we can just trace our way up the call chain to see
> > > >>>who's misbehaving or acting on a flawed design. Also, if you
> > > >>>could run similar tests against the latest 1.2, it would help to
> > > >>>give a frame of reference, particularly for my delivery chain
> > > >>>design.
> > > >>>
> > > >>>
> > > >>That's simple: the main design flaw (actually that's no design
> > > >>flaw I think, that's because the so called database MySQL couldn't
>
> > > >>do transactions in the past) is that dbmail doesn't use
> > > >>transaction.
> > > >>
> > > >>Because of that AutoCommit is used and whenever dbmail does
> > > >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that
>
> > > >>is terribly slow. To ensure the Durability in ACID the database
> > > >>has to fflush() every transaction to stable storage! That's why
> > > >>there is only one solution: we have to use transaction.
> > > >>
> > > >>With transactions we could remove the integrity checks of
> > > >>dbmail-maintenance too, because the database guarantees integrity.
> > > >>
> > > >>Anyway, I did a trace of all SQL queries when a mail is copied
> > > >>using IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations
> > > >>to insert one mail?).
> > > >>
> > > >>When searching for the sequential scan I found something quite
> > > >>interesting in the docs: the planer decides for every scan if a
> > > >>seqScan is cheaper that an index scan, and does a seqScan even if
> > > >>an index
> > > >>exists:
> > > >>
> > > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> owner_idnr=2;
> > > >> QUERY PLAN
> > > >>---------------------------------------------------------
> > > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > > >> Filter: (owner_idnr = 2)
> > > >>(2 rows)
> > > >>
> > > >>The table has an index on owner_idnr.
> > > >>
> > > >>So I should repeat this test with a database with several hundred
> > > >>to thousand user, several dozen mailboxes for each user and
> > > >>several dozen mails in each mailbox to find out if all required
> > > >>indizes are there. Did anyone write a script to create such a
> > > >>database?
> > > >>
> > > >>But I found a strange query:
> > > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> > > >>owner_idnr = '2' mailbox_idnr is the primary key so that could be
> > > >>optimized to: SELECT 4
> > > >>;-)
> > > >>
> > > >>
> > > >>I don't have a 1.2 installation, I'm sorry.
> > > >>
> > > >>
> > > >>--
> > > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key
> (95702B3B)
> > > >>_______________________________________________
> > > >>Dbmail-dev mailing list
> > > >>Dbmail-dev@dbmail.org
> > > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > >>
> > > >>
> > > >>
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > > _______________________________________________
> > > Dbmail-dev mailing list
> > > Dbmail-dev@dbmail.org
> > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > >
> >
> >
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
RE: some speed tests [ In reply to ]
The keywords are ignored, with one small exception:

If you have a transaction that updates both InnoDB/BDB tables and MyISAM
tables, a ROLLBACK statement will result in a warning message.

Regards,

Chris

On Sun, 2004-02-15 at 17:38, Aaron Stone wrote:
> The only question is if the keywords BEGIN and COMMIT are ignored or if they
> generate an error. In the former case, no problems. In the latter case, we
> have to either drop support for certain configurations or code around them.
>
> Aaron
>
>
> "John Hansen" <john@geeknet.com.au> said:
>
> > I assume here that you talk about MySQL 4.
> > MySQL 3 does not have transactions.
> >
> > -----Original Message-----
> > From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> > On Behalf Of Chris Nolan
> > Sent: Sunday, February 15, 2004 11:57 AM
> > To: dbmail-dev@dbmail.org
> > Subject: Re: [Dbmail-dev] some speed tests
> >
> >
> > Forgive the bluntness of the statement, but why is anyone even worrying
> > about transactions as they relate to MySQL???
> >
> > COMMIT in MySQL is passed to the table handler. In the case of MyISAM
> > tables, the handler disregards the statement. For InnoDB and BDB tables,
> >
> > COMMIT acts as it does in PostgreSQL.
> >
> > So various people in this thread implying that MySQL isn't really a
> > database need to do some more reading.
> >
> > In summary, just encapsulate everything in transaction blocks and the
> > underlaying database will act appropriately.
> >
> > Regards,
> >
> > Chris
> >
> > Aaron Stone wrote:
> >
> > >I don't even know where to begin in terms of designing the delivery
> > >chain around transactions. Could we do it as simply as adding
> > >functions...
> > >
> > > void db_begin_transaction(void);
> > > void db_flush_transaction(void); (or db_commit_...?)
> > >
> > >and then calling these functions before and after each major section of
> >
> > >database code? For the delivery chain, we could do it inside of
> > >insert_message(). For dbmail-smtp, this basically means that the
> > >execution of the whole program is within one transaction. For
> > >dbmail-lmtpd, it means that each message is delivered within a
> > >transaction but the miscellaneous queries before the main message
> > >delivery chain are not transacted. For MySQL, these functions would be
> > >noops.
> > >
> > >Thing that might work?
> > >
> > >Aaron
> > >
> > >
> > >Thomas Mueller <dbmail@tmueller.com> said:
> > >
> > >
> > >
> > >>Hi Aaron,
> > >>
> > >>
> > >>
> > >>>Do you have any way of narrowing this down to specific queries that
> > >>>are taking the longest and/or are being executed the most? That would
> >
> > >>>identify which low-level database functions are being called, then we
> >
> > >>>can just trace our way up the call chain to see who's misbehaving or
> > >>>acting on a flawed design. Also, if you could run similar tests
> > >>>against the latest 1.2, it would help to give a frame of reference,
> > >>>particularly for my delivery chain design.
> > >>>
> > >>>
> > >>That's simple: the main design flaw (actually that's no design flaw I
> > >>think, that's because the so called database MySQL couldn't do
> > >>transactions in the past) is that dbmail doesn't use transaction.
> > >>
> > >>Because of that AutoCommit is used and whenever dbmail does
> > >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that is
> > >>terribly slow. To ensure the Durability in ACID the database has to
> > >>fflush() every transaction to stable storage! That's why there is only
> >
> > >>one solution: we have to use transaction.
> > >>
> > >>With transactions we could remove the integrity checks of
> > >>dbmail-maintenance too, because the database guarantees integrity.
> > >>
> > >>Anyway, I did a trace of all SQL queries when a mail is copied using
> > >>IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations to insert
> > >>one mail?).
> > >>
> > >>When searching for the sequential scan I found something quite
> > >>interesting in the docs: the planer decides for every scan if a
> > >>seqScan is cheaper that an index scan, and does a seqScan even if an
> > >>index
> > >>exists:
> > >>
> > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> > owner_idnr=2;
> > >> QUERY PLAN
> > >>---------------------------------------------------------
> > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > >> Filter: (owner_idnr = 2)
> > >>(2 rows)
> > >>
> > >>The table has an index on owner_idnr.
> > >>
> > >>So I should repeat this test with a database with several hundred to
> > >>thousand user, several dozen mailboxes for each user and several dozen
> >
> > >>mails in each mailbox to find out if all required indizes are there.
> > >>Did anyone write a script to create such a database?
> > >>
> > >>But I found a strange query:
> > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> > >>owner_idnr = '2' mailbox_idnr is the primary key so that could be
> > >>optimized to: SELECT 4
> > >>;-)
> > >>
> > >>
> > >>I don't have a 1.2 installation, I'm sorry.
> > >>
> > >>
> > >>--
> > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
> > >>_______________________________________________
> > >>Dbmail-dev mailing list
> > >>Dbmail-dev@dbmail.org
> > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > >>
> > >>
> > >>
> > >
> > >
> > >
> > >
> > >
> >
> >
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> >
>
>
RE: some speed tests [ In reply to ]
MySQL 3.23.x does have transactions in the MAX version through the
InnoDB and BDB table types (and Gemini if you can find it anywhere).

InnoDB ships standard with MySQL 4.0.x and BDB is included in the MAX
version.

Of course, if you compile from the source, you can choose transactional
table type inclusion yourself.

Best regards,

Chris

On Sun, 2004-02-15 at 16:51, John Hansen wrote:
> I assume here that you talk about MySQL 4.
> MySQL 3 does not have transactions.
>
> -----Original Message-----
> From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> On Behalf Of Chris Nolan
> Sent: Sunday, February 15, 2004 11:57 AM
> To: dbmail-dev@dbmail.org
> Subject: Re: [Dbmail-dev] some speed tests
>
>
> Forgive the bluntness of the statement, but why is anyone even worrying
> about transactions as they relate to MySQL???
>
> COMMIT in MySQL is passed to the table handler. In the case of MyISAM
> tables, the handler disregards the statement. For InnoDB and BDB tables,
>
> COMMIT acts as it does in PostgreSQL.
>
> So various people in this thread implying that MySQL isn't really a
> database need to do some more reading.
>
> In summary, just encapsulate everything in transaction blocks and the
> underlaying database will act appropriately.
>
> Regards,
>
> Chris
>
> Aaron Stone wrote:
>
> >I don't even know where to begin in terms of designing the delivery
> >chain around transactions. Could we do it as simply as adding
> >functions...
> >
> > void db_begin_transaction(void);
> > void db_flush_transaction(void); (or db_commit_...?)
> >
> >and then calling these functions before and after each major section of
>
> >database code? For the delivery chain, we could do it inside of
> >insert_message(). For dbmail-smtp, this basically means that the
> >execution of the whole program is within one transaction. For
> >dbmail-lmtpd, it means that each message is delivered within a
> >transaction but the miscellaneous queries before the main message
> >delivery chain are not transacted. For MySQL, these functions would be
> >noops.
> >
> >Thing that might work?
> >
> >Aaron
> >
> >
> >Thomas Mueller <dbmail@tmueller.com> said:
> >
> >
> >
> >>Hi Aaron,
> >>
> >>
> >>
> >>>Do you have any way of narrowing this down to specific queries that
> >>>are taking the longest and/or are being executed the most? That would
>
> >>>identify which low-level database functions are being called, then we
>
> >>>can just trace our way up the call chain to see who's misbehaving or
> >>>acting on a flawed design. Also, if you could run similar tests
> >>>against the latest 1.2, it would help to give a frame of reference,
> >>>particularly for my delivery chain design.
> >>>
> >>>
> >>That's simple: the main design flaw (actually that's no design flaw I
> >>think, that's because the so called database MySQL couldn't do
> >>transactions in the past) is that dbmail doesn't use transaction.
> >>
> >>Because of that AutoCommit is used and whenever dbmail does
> >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and that is
> >>terribly slow. To ensure the Durability in ACID the database has to
> >>fflush() every transaction to stable storage! That's why there is only
>
> >>one solution: we have to use transaction.
> >>
> >>With transactions we could remove the integrity checks of
> >>dbmail-maintenance too, because the database guarantees integrity.
> >>
> >>Anyway, I did a trace of all SQL queries when a mail is copied using
> >>IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db operations to insert
> >>one mail?).
> >>
> >>When searching for the sequential scan I found something quite
> >>interesting in the docs: the planer decides for every scan if a
> >>seqScan is cheaper that an index scan, and does a seqScan even if an
> >>index
> >>exists:
> >>
> >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> owner_idnr=2;
> >> QUERY PLAN
> >>---------------------------------------------------------
> >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> >> Filter: (owner_idnr = 2)
> >>(2 rows)
> >>
> >>The table has an index on owner_idnr.
> >>
> >>So I should repeat this test with a database with several hundred to
> >>thousand user, several dozen mailboxes for each user and several dozen
>
> >>mails in each mailbox to find out if all required indizes are there.
> >>Did anyone write a script to create such a database?
> >>
> >>But I found a strange query:
> >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> >>owner_idnr = '2' mailbox_idnr is the primary key so that could be
> >>optimized to: SELECT 4
> >>;-)
> >>
> >>
> >>I don't have a 1.2 installation, I'm sorry.
> >>
> >>
> >>--
> >>MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
> >>_______________________________________________
> >>Dbmail-dev mailing list
> >>Dbmail-dev@dbmail.org
> >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> >>
> >>
> >>
> >
> >
> >
> >
> >
>
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
Re: some speed tests [ In reply to ]
Hi Chris,

> > > COMMIT in MySQL is passed to the table handler. In the case of MyISAM
> > > tables, the handler disregards the statement. For InnoDB and BDB tables,
> > > COMMIT acts as it does in PostgreSQL.
> >
> > So then we have THREE different configurations to consider, and need to be
> > sure to design the mid-level interface appropriately.
>
> My statement was meant to say that the MyISAM table handler will just
> disregard BEGIN and COMMIT statements, just as it parses but ignores
> CHECK and FOREIGN KEY constraints in table creation, thus illustrating
> that you wouldn't need to add a mysql-with-transactions directory to the
> source tree.

But don't we have to differentiate between a database with and a
database without transactions?
If we have transactions we can simply do all the database stuff, if
anything fails we do the rollback and bail out. We still have a
perfectly consistent database.
Without transactions we have to remember every successful database
modification we made and have to revert that.

That would complicate everything a lot. Is it possible to drop
MySQL-without-transactions support?

> I was referring to someone who said in the body of their message posted
> to another branch of this thread "the supposed database mysql".
> Considering MySQL has proven itself time and time again in terms of
> reliability, licencing flexibility and performance, such statements are
> baseless regardless of what arguments you make for which features.

That was me, I'm sorry! I didn't want to start a MySQL bashing thread. I
had lot of problems with MySQL in the past, that's why I said that. In my
eyes a database should support some basic things to have the right to
carry the name SQL database. Among these are transactions, foreign key
constraints, stored procedures, trigger, subselects. MySQL had not even
one of these features (today they probably have most of these features)
leading to poorly designed software, where the application had to do lot
of work the database should do (a good example in dbmail are the quotas
- the database could and should calculate that on its own).
The case insensitiveness of MySQL is another point where I had lot of
problems in the past. I only know two case insensitive databases: MySQL
and MS Access. Every other database is case sensitive (Postgres, Oracle,
MS SQL, what-ever).
And one should never forget the dual license of MySQL.
On the other hand people tell me that MySQL is faster than Postgres -
that's simply wrong. It's only faster if you use the database type with
less features. There are tons of threads on the postgresql list where
these two databases (and others) are compared.
That's why my heart beats faster when I hear the name MySQL - I'm sorry,
I'll try to suppress these sidekicks.

> > Is the entire point of your comment that we can safely use
> >
> > BEGIN;
> >
> > .. whatever dbmail does...
> >
> > COMMIT;
> >
> > regardless of whether or not we know if the host database actually supports
> > these keywords? You could have just said that.
[..]
> But yes, you have perfectly summarised the point of my comment.

As a quick speed up that's a good solution yes. But if it's the right
permanent solution?


--
MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)
RE: some speed tests [ In reply to ]
Download and install aspseek-1.2.10 from http://www.aspseek.org
Run this against mysql 3.23.49

Run it for a few days, indexing about 1million urls,
Then keep reindexing them and watch your database corrupt...

Regards,

John

-----Original Message-----
From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
On Behalf Of Chris Nolan
Sent: Sunday, February 15, 2004 6:37 PM
To: dbmail-dev@dbmail.org
Subject: RE: [Dbmail-dev] some speed tests


Table corruptions?

I'd be very interested to hear the story around your grief, if for
nothing else, for reference purposes. :-)

Regards,

Chris

On Sun, 2004-02-15 at 17:19, John Hansen wrote:
> Personally, I recent the statement; 'Considering MySQL has proven
> itself time and time again in terms of reliability, licencing
> flexibility and performance, such statements are baseless regardless
> of what arguments you make for which features'; based on experience,
> mysql proved that it was not reliable, tho performance was top of the
> line.
>
> In my opinion, if you're concerned with reliability, mysql should not
> be an option. I had daily table corruptions, and thus one of my main
> tasks was to run mysql repair jobs regularly, just to keep everything
> running.
>
> Granted, postgresql has it's own problems, mainly with performance and

> lack of a proper master / slave replication model, but that's a small
> price to pay for reliability.
>
> Just my $0.02 worth.
>
> Regards,
>
> John Hansen
>
> -----Original Message-----
> From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> On Behalf Of Chris Nolan
> Sent: Sunday, February 15, 2004 4:42 PM
> To: dbmail-dev@dbmail.org
> Subject: Re: [Dbmail-dev] some speed tests
>
>
> Dear Aaron,
>
> My comments are also inline. :-)
>
> On Sun, 2004-02-15 at 12:27, Aaron Stone wrote:
> > Comments inline...
> >
> > Chris Nolan <chris@imi.com.au> said:
> >
> > > Forgive the bluntness of the statement, but why is anyone even
> > > worrying
> > > about transactions as they relate to MySQL???
> >
> > Because we are currently structured to have a single set of
> > mid-level
> > database operations that are translated into specific low-level
> > database function calls. The upside is that there isn't any database

> > specific handling outside of these mid-level functions. The downside

> > is that we need to make sure that function calls for certain
features
> > are consistent with analogous features in each database and that
they
> > carry with them enough information to make the appropriate low-level

> > call. A good example is the last inserted id number. In MySQL, you
> > only need the database connection identifier to get this. In
> > PostgreSQL, yo need both the database connection and the table
> > identifier. For DBMail to have a mid-level function that worked for
> > both, we'd have to make sure that it took both arguments and used
them
>
> > as needed for whichever database's low-level calls were being used.
> > (Those were off the top of my head, so they may be incorrect, but
they
>
> > do illustrate my point.)
>
> I've looked through the DBMail source code on a few occassions and
> even released a dodgy tool to convert Cyrus mailboxes to DBMail a
> while ago. The fact that I can use the calls present in the DB modules

> and avoid having to write queries by hand for each different DB is an
> excellent feature of the DBMail API. I have nothing negative to say
> here at all.
>
> >
> > > COMMIT in MySQL is passed to the table handler. In the case of
> > > MyISAM
> > > tables, the handler disregards the statement. For InnoDB and BDB
> tables,
> > > COMMIT acts as it does in PostgreSQL.
> >
> > So then we have THREE different configurations to consider, and need
> > to be sure to design the mid-level interface appropriately.
>
> My statement was meant to say that the MyISAM table handler will just
> disregard BEGIN and COMMIT statements, just as it parses but ignores
> CHECK and FOREIGN KEY constraints in table creation, thus illustrating

> that you wouldn't need to add a mysql-with-transactions directory to
> the source tree.
>
> >
> > > So various people in this thread implying that MySQL isn't really
> > > a database need to do some more reading.
> >
> > If you're referring to my suggestion that the transaction functions
> > are a noop for MySQL, then you're reading too deeply... I just
didn't
> > realize that InnoDB would handle transactions entirely normally.
>
> My apologies, I was not referring to you at all! Yourself and your
> collegues have provided the world with a very, very funky mail
> repository!
>
> I was referring to someone who said in the body of their message
> posted to another branch of this thread "the supposed database mysql".

> Considering MySQL has proven itself time and time again in terms of
> reliability, licencing flexibility and performance, such statements
> are baseless regardless of what arguments you make for which features.

> For example, Visual FoxPro's backend is technically a database but
> provides nothing in the way of a privilege system (you need write
> access to record locking, so filesystem-level controls are no good
> either).
>
> >
> > > In summary, just encapsulate everything in transaction blocks and
> > > the
> > > underlaying database will act appropriately.
> >
> > Right. That's what we're talking about.
> >
> > Is the entire point of your comment that we can safely use
> >
> > BEGIN;
> >
> > .. whatever dbmail does...
> >
> > COMMIT;
> >
> > regardless of whether or not we know if the host database actually
> > supports these keywords? You could have just said that.
>
> I could have, but then I would have risked putting an end to the
> thread.
> :-)
>
> Please accept my apology - I never meant to offend yourself or anyone
> else who contributes to DBMail. It seems my reading of the post I
> mentioned above caused me to be a bit heavier than I should have been.
>
> But yes, you have perfectly summarised the point of my comment.
> >
> > > Regards,
> > >
> > > Chris
> > >
> > > Aaron Stone wrote:
> > >
> > > >I don't even know where to begin in terms of designing the
> > > >delivery
>
> > > >chain around transactions. Could we do it as simply as adding
> > > >functions...
> > > >
> > > > void db_begin_transaction(void);
> > > > void db_flush_transaction(void); (or db_commit_...?)
> > > >
> > > >and then calling these functions before and after each major
> > > >section of database code? For the delivery chain, we could do it
> > > >inside of insert_message(). For dbmail-smtp, this basically means

> > > >that the execution of the whole program is within one
transaction.
> > > >For dbmail-lmtpd, it means that each message is delivered within
a
> > > >transaction but the miscellaneous queries before the main message

> > > >delivery chain are not transacted. For MySQL, these functions
would
>
> > > >be noops.
> > > >
> > > >Thing that might work?
> > > >
> > > >Aaron
> > > >
> > > >
> > > >Thomas Mueller <dbmail@tmueller.com> said:
> > > >
> > > >
> > > >
> > > >>Hi Aaron,
> > > >>
> > > >>
> > > >>
> > > >>>Do you have any way of narrowing this down to specific queries
> > > >>>that are taking the longest and/or are being executed the most?

> > > >>>That would identify which low-level database functions are
being
> > > >>>called, then we can just trace our way up the call chain to see

> > > >>>who's misbehaving or acting on a flawed design. Also, if you
> > > >>>could run similar tests against the latest 1.2, it would help
to
> > > >>>give a frame of reference, particularly for my delivery chain
> > > >>>design.
> > > >>>
> > > >>>
> > > >>That's simple: the main design flaw (actually that's no design
> > > >>flaw I think, that's because the so called database MySQL
couldn't
>
> > > >>do transactions in the past) is that dbmail doesn't use
> > > >>transaction.
> > > >>
> > > >>Because of that AutoCommit is used and whenever dbmail does
> > > >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and
that
>
> > > >>is terribly slow. To ensure the Durability in ACID the database
> > > >>has to fflush() every transaction to stable storage! That's why
> > > >>there is only one solution: we have to use transaction.
> > > >>
> > > >>With transactions we could remove the integrity checks of
> > > >>dbmail-maintenance too, because the database guarantees
integrity.
> > > >>
> > > >>Anyway, I did a trace of all SQL queries when a mail is copied
> > > >>using IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db
operations
> > > >>to insert one mail?).
> > > >>
> > > >>When searching for the sequential scan I found something quite
> > > >>interesting in the docs: the planer decides for every scan if a
> > > >>seqScan is cheaper that an index scan, and does a seqScan even
if
> > > >>an index
> > > >>exists:
> > > >>
> > > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> owner_idnr=2;
> > > >> QUERY PLAN
> > > >>---------------------------------------------------------
> > > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > > >> Filter: (owner_idnr = 2)
> > > >>(2 rows)
> > > >>
> > > >>The table has an index on owner_idnr.
> > > >>
> > > >>So I should repeat this test with a database with several
> > > >>hundred
> > > >>to thousand user, several dozen mailboxes for each user and
> > > >>several dozen mails in each mailbox to find out if all required
> > > >>indizes are there. Did anyone write a script to create such a
> > > >>database?
> > > >>
> > > >>But I found a strange query:
> > > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> > > >>owner_idnr = '2' mailbox_idnr is the primary key so that could
be
> > > >>optimized to: SELECT 4
> > > >>;-)
> > > >>
> > > >>
> > > >>I don't have a 1.2 installation, I'm sorry.
> > > >>
> > > >>
> > > >>--
> > > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key
> (95702B3B)
> > > >>_______________________________________________
> > > >>Dbmail-dev mailing list
> > > >>Dbmail-dev@dbmail.org
> > > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > >>
> > > >>
> > > >>
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> > > _______________________________________________
> > > Dbmail-dev mailing list
> > > Dbmail-dev@dbmail.org
> > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > >
> >
> >
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev
Re: some speed tests [ In reply to ]
On Sun, 2004-02-15 at 20:36, Thomas Mueller wrote:
> Hi Chris,
>
> > > > COMMIT in MySQL is passed to the table handler. In the case of MyISAM
> > > > tables, the handler disregards the statement. For InnoDB and BDB tables,
> > > > COMMIT acts as it does in PostgreSQL.
> > >
> > > So then we have THREE different configurations to consider, and need to be
> > > sure to design the mid-level interface appropriately.
> >
> > My statement was meant to say that the MyISAM table handler will just
> > disregard BEGIN and COMMIT statements, just as it parses but ignores
> > CHECK and FOREIGN KEY constraints in table creation, thus illustrating
> > that you wouldn't need to add a mysql-with-transactions directory to the
> > source tree.
>
> But don't we have to differentiate between a database with and a
> database without transactions?
> If we have transactions we can simply do all the database stuff, if
> anything fails we do the rollback and bail out. We still have a
> perfectly consistent database.
> Without transactions we have to remember every successful database
> modification we made and have to revert that.

That would be very difficult in the case of UPDATEs though.
Additionally, with MyISAM tables there is a problem with race hazards.
For example:

SELECT * FROM table WHERE id = 3;
UPDATE table SET attr = 'thing' WHERE id = 3;
// Error occurs
UPDATE table SET attr = $oldvalue WHERE id = 3;

In the above statement, you have no way of ensuring that no one else has
buggered around with the row we're interested in. As DBMail's queries
seem to be primarily INSERTs and SELECTs though, this isn't as big a
problem as using LOCK TABLE statements would not result in a noticable
performance hit.


>
> That would complicate everything a lot. Is it possible to drop
> MySQL-without-transactions support?
>
> > I was referring to someone who said in the body of their message posted
> > to another branch of this thread "the supposed database mysql".
> > Considering MySQL has proven itself time and time again in terms of
> > reliability, licencing flexibility and performance, such statements are
> > baseless regardless of what arguments you make for which features.
>
> That was me, I'm sorry! I didn't want to start a MySQL bashing thread. I
> had lot of problems with MySQL in the past, that's why I said that. In my
> eyes a database should support some basic things to have the right to
> carry the name SQL database. Among these are transactions, foreign key
> constraints, stored procedures, trigger, subselects. MySQL had not even
> one of these features (today they probably have most of these features)
> leading to poorly designed software, where the application had to do lot
> of work the database should do (a good example in dbmail are the quotas
> - the database could and should calculate that on its own).
> The case insensitiveness of MySQL is another point where I had lot of
> problems in the past. I only know two case insensitive databases: MySQL
> and MS Access. Every other database is case sensitive (Postgres, Oracle,
> MS SQL, what-ever).
> And one should never forget the dual license of MySQL.
> On the other hand people tell me that MySQL is faster than Postgres -
> that's simply wrong. It's only faster if you use the database type with
> less features. There are tons of threads on the postgresql list where
> these two databases (and others) are compared.
> That's why my heart beats faster when I hear the name MySQL - I'm sorry,
> I'll try to suppress these sidekicks.
>
The reason I react to statements against my favourite database, MySQL,
is simply due to the things I have done with it and things I have seen
others do with it.

Today, MySQL does have transactions. foreign key constraints and
subselect and stored procedure functionality are in beta.

MySQL's case insensitivity is a result of having to compensate for
Windows' terrible deficencies in some areas and a design choice in
others (which is actually a pretty stupid design choice if you ask me -
case-sensitive string comparisons are always faster).

Regarding the comparisons you speak of, could you point me to some of
them? I'm yet to see a recent, decent comparison between the two. I'd
love to see PostgreSQL 7.4.1 vs MySQL 4.0.17 / 4.1.1 on Linux 2.6.x or
FreeBSD 5.2.x .

I find your comment regarding the dual licence very interesting. If
you'd like to continue this (hopefully friendly) discussion, please
write to me off the list. :-)


> > > Is the entire point of your comment that we can safely use
> > >
> > > BEGIN;
> > >
> > > .. whatever dbmail does...
> > >
> > > COMMIT;
> > >
> > > regardless of whether or not we know if the host database actually supports
> > > these keywords? You could have just said that.
> [..]
> > But yes, you have perfectly summarised the point of my comment.
>
> As a quick speed up that's a good solution yes. But if it's the right
> permanent solution?

Personally, I think it is. MyISAM in MySQL is a great table type for lots and lots of INSERT statements OR lots and lots
of SELECT statements but not both (quoting Jeremy Zawdony from Yahoo!).
Anyone serious about performance in a case like DBMail would probably
use InnoDB anyway. The fact that it is now part of the standard MySQL
distro gives little reason to not use MyISAM tables except for their
ability to perform FULLTEXT searches.

Regards,

Chris
RE: some speed tests [ In reply to ]
Now you've peaked my curiosity!

Do you have any additional details regarding table types used and
whether or not the same thing occurs in the 4.0.x series?

Regards,

Chris

On Sun, 2004-02-15 at 21:14, John Hansen wrote:
> Download and install aspseek-1.2.10 from http://www.aspseek.org
> Run this against mysql 3.23.49
>
> Run it for a few days, indexing about 1million urls,
> Then keep reindexing them and watch your database corrupt...
>
> Regards,
>
> John
>
> -----Original Message-----
> From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> On Behalf Of Chris Nolan
> Sent: Sunday, February 15, 2004 6:37 PM
> To: dbmail-dev@dbmail.org
> Subject: RE: [Dbmail-dev] some speed tests
>
>
> Table corruptions?
>
> I'd be very interested to hear the story around your grief, if for
> nothing else, for reference purposes. :-)
>
> Regards,
>
> Chris
>
> On Sun, 2004-02-15 at 17:19, John Hansen wrote:
> > Personally, I recent the statement; 'Considering MySQL has proven
> > itself time and time again in terms of reliability, licencing
> > flexibility and performance, such statements are baseless regardless
> > of what arguments you make for which features'; based on experience,
> > mysql proved that it was not reliable, tho performance was top of the
> > line.
> >
> > In my opinion, if you're concerned with reliability, mysql should not
> > be an option. I had daily table corruptions, and thus one of my main
> > tasks was to run mysql repair jobs regularly, just to keep everything
> > running.
> >
> > Granted, postgresql has it's own problems, mainly with performance and
>
> > lack of a proper master / slave replication model, but that's a small
> > price to pay for reliability.
> >
> > Just my $0.02 worth.
> >
> > Regards,
> >
> > John Hansen
> >
> > -----Original Message-----
> > From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> > On Behalf Of Chris Nolan
> > Sent: Sunday, February 15, 2004 4:42 PM
> > To: dbmail-dev@dbmail.org
> > Subject: Re: [Dbmail-dev] some speed tests
> >
> >
> > Dear Aaron,
> >
> > My comments are also inline. :-)
> >
> > On Sun, 2004-02-15 at 12:27, Aaron Stone wrote:
> > > Comments inline...
> > >
> > > Chris Nolan <chris@imi.com.au> said:
> > >
> > > > Forgive the bluntness of the statement, but why is anyone even
> > > > worrying
> > > > about transactions as they relate to MySQL???
> > >
> > > Because we are currently structured to have a single set of
> > > mid-level
> > > database operations that are translated into specific low-level
> > > database function calls. The upside is that there isn't any database
>
> > > specific handling outside of these mid-level functions. The downside
>
> > > is that we need to make sure that function calls for certain
> features
> > > are consistent with analogous features in each database and that
> they
> > > carry with them enough information to make the appropriate low-level
>
> > > call. A good example is the last inserted id number. In MySQL, you
> > > only need the database connection identifier to get this. In
> > > PostgreSQL, yo need both the database connection and the table
> > > identifier. For DBMail to have a mid-level function that worked for
> > > both, we'd have to make sure that it took both arguments and used
> them
> >
> > > as needed for whichever database's low-level calls were being used.
> > > (Those were off the top of my head, so they may be incorrect, but
> they
> >
> > > do illustrate my point.)
> >
> > I've looked through the DBMail source code on a few occassions and
> > even released a dodgy tool to convert Cyrus mailboxes to DBMail a
> > while ago. The fact that I can use the calls present in the DB modules
>
> > and avoid having to write queries by hand for each different DB is an
> > excellent feature of the DBMail API. I have nothing negative to say
> > here at all.
> >
> > >
> > > > COMMIT in MySQL is passed to the table handler. In the case of
> > > > MyISAM
> > > > tables, the handler disregards the statement. For InnoDB and BDB
> > tables,
> > > > COMMIT acts as it does in PostgreSQL.
> > >
> > > So then we have THREE different configurations to consider, and need
> > > to be sure to design the mid-level interface appropriately.
> >
> > My statement was meant to say that the MyISAM table handler will just
> > disregard BEGIN and COMMIT statements, just as it parses but ignores
> > CHECK and FOREIGN KEY constraints in table creation, thus illustrating
>
> > that you wouldn't need to add a mysql-with-transactions directory to
> > the source tree.
> >
> > >
> > > > So various people in this thread implying that MySQL isn't really
> > > > a database need to do some more reading.
> > >
> > > If you're referring to my suggestion that the transaction functions
> > > are a noop for MySQL, then you're reading too deeply... I just
> didn't
> > > realize that InnoDB would handle transactions entirely normally.
> >
> > My apologies, I was not referring to you at all! Yourself and your
> > collegues have provided the world with a very, very funky mail
> > repository!
> >
> > I was referring to someone who said in the body of their message
> > posted to another branch of this thread "the supposed database mysql".
>
> > Considering MySQL has proven itself time and time again in terms of
> > reliability, licencing flexibility and performance, such statements
> > are baseless regardless of what arguments you make for which features.
>
> > For example, Visual FoxPro's backend is technically a database but
> > provides nothing in the way of a privilege system (you need write
> > access to record locking, so filesystem-level controls are no good
> > either).
> >
> > >
> > > > In summary, just encapsulate everything in transaction blocks and
> > > > the
> > > > underlaying database will act appropriately.
> > >
> > > Right. That's what we're talking about.
> > >
> > > Is the entire point of your comment that we can safely use
> > >
> > > BEGIN;
> > >
> > > .. whatever dbmail does...
> > >
> > > COMMIT;
> > >
> > > regardless of whether or not we know if the host database actually
> > > supports these keywords? You could have just said that.
> >
> > I could have, but then I would have risked putting an end to the
> > thread.
> > :-)
> >
> > Please accept my apology - I never meant to offend yourself or anyone
> > else who contributes to DBMail. It seems my reading of the post I
> > mentioned above caused me to be a bit heavier than I should have been.
> >
> > But yes, you have perfectly summarised the point of my comment.
> > >
> > > > Regards,
> > > >
> > > > Chris
> > > >
> > > > Aaron Stone wrote:
> > > >
> > > > >I don't even know where to begin in terms of designing the
> > > > >delivery
> >
> > > > >chain around transactions. Could we do it as simply as adding
> > > > >functions...
> > > > >
> > > > > void db_begin_transaction(void);
> > > > > void db_flush_transaction(void); (or db_commit_...?)
> > > > >
> > > > >and then calling these functions before and after each major
> > > > >section of database code? For the delivery chain, we could do it
> > > > >inside of insert_message(). For dbmail-smtp, this basically means
>
> > > > >that the execution of the whole program is within one
> transaction.
> > > > >For dbmail-lmtpd, it means that each message is delivered within
> a
> > > > >transaction but the miscellaneous queries before the main message
>
> > > > >delivery chain are not transacted. For MySQL, these functions
> would
> >
> > > > >be noops.
> > > > >
> > > > >Thing that might work?
> > > > >
> > > > >Aaron
> > > > >
> > > > >
> > > > >Thomas Mueller <dbmail@tmueller.com> said:
> > > > >
> > > > >
> > > > >
> > > > >>Hi Aaron,
> > > > >>
> > > > >>
> > > > >>
> > > > >>>Do you have any way of narrowing this down to specific queries
> > > > >>>that are taking the longest and/or are being executed the most?
>
> > > > >>>That would identify which low-level database functions are
> being
> > > > >>>called, then we can just trace our way up the call chain to see
>
> > > > >>>who's misbehaving or acting on a flawed design. Also, if you
> > > > >>>could run similar tests against the latest 1.2, it would help
> to
> > > > >>>give a frame of reference, particularly for my delivery chain
> > > > >>>design.
> > > > >>>
> > > > >>>
> > > > >>That's simple: the main design flaw (actually that's no design
> > > > >>flaw I think, that's because the so called database MySQL
> couldn't
> >
> > > > >>do transactions in the past) is that dbmail doesn't use
> > > > >>transaction.
> > > > >>
> > > > >>Because of that AutoCommit is used and whenever dbmail does
> > > > >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and
> that
> >
> > > > >>is terribly slow. To ensure the Durability in ACID the database
> > > > >>has to fflush() every transaction to stable storage! That's why
> > > > >>there is only one solution: we have to use transaction.
> > > > >>
> > > > >>With transactions we could remove the integrity checks of
> > > > >>dbmail-maintenance too, because the database guarantees
> integrity.
> > > > >>
> > > > >>Anyway, I did a trace of all SQL queries when a mail is copied
> > > > >>using IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db
> operations
> > > > >>to insert one mail?).
> > > > >>
> > > > >>When searching for the sequential scan I found something quite
> > > > >>interesting in the docs: the planer decides for every scan if a
> > > > >>seqScan is cheaper that an index scan, and does a seqScan even
> if
> > > > >>an index
> > > > >>exists:
> > > > >>
> > > > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> > owner_idnr=2;
> > > > >> QUERY PLAN
> > > > >>---------------------------------------------------------
> > > > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > > > >> Filter: (owner_idnr = 2)
> > > > >>(2 rows)
> > > > >>
> > > > >>The table has an index on owner_idnr.
> > > > >>
> > > > >>So I should repeat this test with a database with several
> > > > >>hundred
> > > > >>to thousand user, several dozen mailboxes for each user and
> > > > >>several dozen mails in each mailbox to find out if all required
> > > > >>indizes are there. Did anyone write a script to create such a
> > > > >>database?
> > > > >>
> > > > >>But I found a strange query:
> > > > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4' AND
> > > > >>owner_idnr = '2' mailbox_idnr is the primary key so that could
> be
> > > > >>optimized to: SELECT 4
> > > > >>;-)
> > > > >>
> > > > >>
> > > > >>I don't have a 1.2 installation, I'm sorry.
> > > > >>
> > > > >>
> > > > >>--
> > > > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key
> > (95702B3B)
> > > > >>_______________________________________________
> > > > >>Dbmail-dev mailing list
> > > > >>Dbmail-dev@dbmail.org
> > > > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > > >>
> > > > >>
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Dbmail-dev mailing list
> > > > Dbmail-dev@dbmail.org
> > > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > >
> > >
> > >
> >
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
RE: some speed tests [ In reply to ]
We tried both innodb and myisam with the same results, except of course
performance suffered when using the innodb type. However both types
resulted in corrupted tables after a few days starting from scratch.

Following that corruption happened daily.

I abandoned mysql early last year as I rewrote the application to use
postgresql and haven't had any problems since.

Other reasons I abandoned mysql was that to my dismay I discovered that
statements such as select .. From table for update; is allowed. This
should definately throw an error since for update is not supported and
thus would lead to inconsistent data.

I have not attempted to use mysql 4.

-----Original Message-----
From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
On Behalf Of Chris Nolan
Sent: Sunday, February 15, 2004 9:31 PM
To: dbmail-dev@dbmail.org
Subject: RE: [Dbmail-dev] some speed tests


Now you've peaked my curiosity!

Do you have any additional details regarding table types used and
whether or not the same thing occurs in the 4.0.x series?

Regards,

Chris

On Sun, 2004-02-15 at 21:14, John Hansen wrote:
> Download and install aspseek-1.2.10 from http://www.aspseek.org Run
> this against mysql 3.23.49
>
> Run it for a few days, indexing about 1million urls,
> Then keep reindexing them and watch your database corrupt...
>
> Regards,
>
> John
>
> -----Original Message-----
> From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> On Behalf Of Chris Nolan
> Sent: Sunday, February 15, 2004 6:37 PM
> To: dbmail-dev@dbmail.org
> Subject: RE: [Dbmail-dev] some speed tests
>
>
> Table corruptions?
>
> I'd be very interested to hear the story around your grief, if for
> nothing else, for reference purposes. :-)
>
> Regards,
>
> Chris
>
> On Sun, 2004-02-15 at 17:19, John Hansen wrote:
> > Personally, I recent the statement; 'Considering MySQL has proven
> > itself time and time again in terms of reliability, licencing
> > flexibility and performance, such statements are baseless regardless

> > of what arguments you make for which features'; based on experience,

> > mysql proved that it was not reliable, tho performance was top of
the
> > line.
> >
> > In my opinion, if you're concerned with reliability, mysql should
> > not
> > be an option. I had daily table corruptions, and thus one of my main

> > tasks was to run mysql repair jobs regularly, just to keep
everything
> > running.
> >
> > Granted, postgresql has it's own problems, mainly with performance
> > and
>
> > lack of a proper master / slave replication model, but that's a
> > small
> > price to pay for reliability.
> >
> > Just my $0.02 worth.
> >
> > Regards,
> >
> > John Hansen
> >
> > -----Original Message-----
> > From: dbmail-dev-admin@dbmail.org
> > [mailto:dbmail-dev-admin@dbmail.org]
> > On Behalf Of Chris Nolan
> > Sent: Sunday, February 15, 2004 4:42 PM
> > To: dbmail-dev@dbmail.org
> > Subject: Re: [Dbmail-dev] some speed tests
> >
> >
> > Dear Aaron,
> >
> > My comments are also inline. :-)
> >
> > On Sun, 2004-02-15 at 12:27, Aaron Stone wrote:
> > > Comments inline...
> > >
> > > Chris Nolan <chris@imi.com.au> said:
> > >
> > > > Forgive the bluntness of the statement, but why is anyone even
> > > > worrying about transactions as they relate to MySQL???
> > >
> > > Because we are currently structured to have a single set of
> > > mid-level
> > > database operations that are translated into specific low-level
> > > database function calls. The upside is that there isn't any
database
>
> > > specific handling outside of these mid-level functions. The
> > > downside
>
> > > is that we need to make sure that function calls for certain
> features
> > > are consistent with analogous features in each database and that
> they
> > > carry with them enough information to make the appropriate
> > > low-level
>
> > > call. A good example is the last inserted id number. In MySQL, you
> > > only need the database connection identifier to get this. In
> > > PostgreSQL, yo need both the database connection and the table
> > > identifier. For DBMail to have a mid-level function that worked
for
> > > both, we'd have to make sure that it took both arguments and used
> them
> >
> > > as needed for whichever database's low-level calls were being
> > > used. (Those were off the top of my head, so they may be
> > > incorrect, but
> they
> >
> > > do illustrate my point.)
> >
> > I've looked through the DBMail source code on a few occassions and
> > even released a dodgy tool to convert Cyrus mailboxes to DBMail a
> > while ago. The fact that I can use the calls present in the DB
modules
>
> > and avoid having to write queries by hand for each different DB is
> > an
> > excellent feature of the DBMail API. I have nothing negative to say
> > here at all.
> >
> > >
> > > > COMMIT in MySQL is passed to the table handler. In the case of
> > > > MyISAM tables, the handler disregards the statement. For InnoDB
> > > > and BDB
> > tables,
> > > > COMMIT acts as it does in PostgreSQL.
> > >
> > > So then we have THREE different configurations to consider, and
> > > need to be sure to design the mid-level interface appropriately.
> >
> > My statement was meant to say that the MyISAM table handler will
> > just
> > disregard BEGIN and COMMIT statements, just as it parses but ignores

> > CHECK and FOREIGN KEY constraints in table creation, thus
illustrating
>
> > that you wouldn't need to add a mysql-with-transactions directory to
> > the source tree.
> >
> > >
> > > > So various people in this thread implying that MySQL isn't
> > > > really
> > > > a database need to do some more reading.
> > >
> > > If you're referring to my suggestion that the transaction
> > > functions are a noop for MySQL, then you're reading too deeply...
> > > I just
> didn't
> > > realize that InnoDB would handle transactions entirely normally.
> >
> > My apologies, I was not referring to you at all! Yourself and your
> > collegues have provided the world with a very, very funky mail
> > repository!
> >
> > I was referring to someone who said in the body of their message
> > posted to another branch of this thread "the supposed database
mysql".
>
> > Considering MySQL has proven itself time and time again in terms of
> > reliability, licencing flexibility and performance, such statements
> > are baseless regardless of what arguments you make for which
features.
>
> > For example, Visual FoxPro's backend is technically a database but
> > provides nothing in the way of a privilege system (you need write
> > access to record locking, so filesystem-level controls are no good
> > either).
> >
> > >
> > > > In summary, just encapsulate everything in transaction blocks
> > > > and the underlaying database will act appropriately.
> > >
> > > Right. That's what we're talking about.
> > >
> > > Is the entire point of your comment that we can safely use
> > >
> > > BEGIN;
> > >
> > > .. whatever dbmail does...
> > >
> > > COMMIT;
> > >
> > > regardless of whether or not we know if the host database actually

> > > supports these keywords? You could have just said that.
> >
> > I could have, but then I would have risked putting an end to the
> > thread.
> > :-)
> >
> > Please accept my apology - I never meant to offend yourself or
> > anyone
> > else who contributes to DBMail. It seems my reading of the post I
> > mentioned above caused me to be a bit heavier than I should have
been.
> >
> > But yes, you have perfectly summarised the point of my comment.
> > >
> > > > Regards,
> > > >
> > > > Chris
> > > >
> > > > Aaron Stone wrote:
> > > >
> > > > >I don't even know where to begin in terms of designing the
> > > > >delivery
> >
> > > > >chain around transactions. Could we do it as simply as adding
> > > > >functions...
> > > > >
> > > > > void db_begin_transaction(void);
> > > > > void db_flush_transaction(void); (or db_commit_...?)
> > > > >
> > > > >and then calling these functions before and after each major
> > > > >section of database code? For the delivery chain, we could do
> > > > >it inside of insert_message(). For dbmail-smtp, this basically
> > > > >means
>
> > > > >that the execution of the whole program is within one
> transaction.
> > > > >For dbmail-lmtpd, it means that each message is delivered
> > > > >within
> a
> > > > >transaction but the miscellaneous queries before the main
> > > > >message
>
> > > > >delivery chain are not transacted. For MySQL, these functions
> would
> >
> > > > >be noops.
> > > > >
> > > > >Thing that might work?
> > > > >
> > > > >Aaron
> > > > >
> > > > >
> > > > >Thomas Mueller <dbmail@tmueller.com> said:
> > > > >
> > > > >
> > > > >
> > > > >>Hi Aaron,
> > > > >>
> > > > >>
> > > > >>
> > > > >>>Do you have any way of narrowing this down to specific
> > > > >>>queries that are taking the longest and/or are being executed

> > > > >>>the most?
>
> > > > >>>That would identify which low-level database functions are
> being
> > > > >>>called, then we can just trace our way up the call chain to
> > > > >>>see
>
> > > > >>>who's misbehaving or acting on a flawed design. Also, if you
> > > > >>>could run similar tests against the latest 1.2, it would help
> to
> > > > >>>give a frame of reference, particularly for my delivery chain
> > > > >>>design.
> > > > >>>
> > > > >>>
> > > > >>That's simple: the main design flaw (actually that's no design

> > > > >>flaw I think, that's because the so called database MySQL
> couldn't
> >
> > > > >>do transactions in the past) is that dbmail doesn't use
> > > > >>transaction.
> > > > >>
> > > > >>Because of that AutoCommit is used and whenever dbmail does
> > > > >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and
> that
> >
> > > > >>is terribly slow. To ensure the Durability in ACID the
> > > > >>database has to fflush() every transaction to stable storage!
> > > > >>That's why there is only one solution: we have to use
> > > > >>transaction.
> > > > >>
> > > > >>With transactions we could remove the integrity checks of
> > > > >>dbmail-maintenance too, because the database guarantees
> integrity.
> > > > >>
> > > > >>Anyway, I did a trace of all SQL queries when a mail is copied

> > > > >>using IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db
> operations
> > > > >>to insert one mail?).
> > > > >>
> > > > >>When searching for the sequential scan I found something quite

> > > > >>interesting in the docs: the planer decides for every scan if
> > > > >>a seqScan is cheaper that an index scan, and does a seqScan
> > > > >>even
> if
> > > > >>an index
> > > > >>exists:
> > > > >>
> > > > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> > owner_idnr=2;
> > > > >> QUERY PLAN
> > > > >>---------------------------------------------------------
> > > > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > > > >> Filter: (owner_idnr = 2)
> > > > >>(2 rows)
> > > > >>
> > > > >>The table has an index on owner_idnr.
> > > > >>
> > > > >>So I should repeat this test with a database with several
> > > > >>hundred
> > > > >>to thousand user, several dozen mailboxes for each user and
> > > > >>several dozen mails in each mailbox to find out if all
required
> > > > >>indizes are there. Did anyone write a script to create such a
> > > > >>database?
> > > > >>
> > > > >>But I found a strange query:
> > > > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4'
> > > > >>AND owner_idnr = '2' mailbox_idnr is the primary key so that
> > > > >>could
> be
> > > > >>optimized to: SELECT 4
> > > > >>;-)
> > > > >>
> > > > >>
> > > > >>I don't have a 1.2 installation, I'm sorry.
> > > > >>
> > > > >>
> > > > >>--
> > > > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key
> > (95702B3B)
> > > > >>_______________________________________________
> > > > >>Dbmail-dev mailing list
> > > > >>Dbmail-dev@dbmail.org
> > > > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > > >>
> > > > >>
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > Dbmail-dev mailing list
> > > > Dbmail-dev@dbmail.org
> > > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > >
> > >
> > >
> >
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev
RE: some speed tests [ In reply to ]
Hmm...does ASPSeek utilise transactions at all?

The fact that it happened with both points to some glaring bug inside
MySQL's upper layers or a hardware problem, but since PostgreSQL was
fine (I'm assuming on the same hardware), that's a totally discounted
possibility.

Did anyone on the MySQL mailing list give you any joy?

Regards,

Chris

On Sun, 2004-02-15 at 21:40, John Hansen wrote:
> We tried both innodb and myisam with the same results, except of course
> performance suffered when using the innodb type. However both types
> resulted in corrupted tables after a few days starting from scratch.
>
> Following that corruption happened daily.
>
> I abandoned mysql early last year as I rewrote the application to use
> postgresql and haven't had any problems since.
>
> Other reasons I abandoned mysql was that to my dismay I discovered that
> statements such as select .. From table for update; is allowed. This
> should definately throw an error since for update is not supported and
> thus would lead to inconsistent data.
>
> I have not attempted to use mysql 4.
>
> -----Original Message-----
> From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> On Behalf Of Chris Nolan
> Sent: Sunday, February 15, 2004 9:31 PM
> To: dbmail-dev@dbmail.org
> Subject: RE: [Dbmail-dev] some speed tests
>
>
> Now you've peaked my curiosity!
>
> Do you have any additional details regarding table types used and
> whether or not the same thing occurs in the 4.0.x series?
>
> Regards,
>
> Chris
>
> On Sun, 2004-02-15 at 21:14, John Hansen wrote:
> > Download and install aspseek-1.2.10 from http://www.aspseek.org Run
> > this against mysql 3.23.49
> >
> > Run it for a few days, indexing about 1million urls,
> > Then keep reindexing them and watch your database corrupt...
> >
> > Regards,
> >
> > John
> >
> > -----Original Message-----
> > From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> > On Behalf Of Chris Nolan
> > Sent: Sunday, February 15, 2004 6:37 PM
> > To: dbmail-dev@dbmail.org
> > Subject: RE: [Dbmail-dev] some speed tests
> >
> >
> > Table corruptions?
> >
> > I'd be very interested to hear the story around your grief, if for
> > nothing else, for reference purposes. :-)
> >
> > Regards,
> >
> > Chris
> >
> > On Sun, 2004-02-15 at 17:19, John Hansen wrote:
> > > Personally, I recent the statement; 'Considering MySQL has proven
> > > itself time and time again in terms of reliability, licencing
> > > flexibility and performance, such statements are baseless regardless
>
> > > of what arguments you make for which features'; based on experience,
>
> > > mysql proved that it was not reliable, tho performance was top of
> the
> > > line.
> > >
> > > In my opinion, if you're concerned with reliability, mysql should
> > > not
> > > be an option. I had daily table corruptions, and thus one of my main
>
> > > tasks was to run mysql repair jobs regularly, just to keep
> everything
> > > running.
> > >
> > > Granted, postgresql has it's own problems, mainly with performance
> > > and
> >
> > > lack of a proper master / slave replication model, but that's a
> > > small
> > > price to pay for reliability.
> > >
> > > Just my $0.02 worth.
> > >
> > > Regards,
> > >
> > > John Hansen
> > >
> > > -----Original Message-----
> > > From: dbmail-dev-admin@dbmail.org
> > > [mailto:dbmail-dev-admin@dbmail.org]
> > > On Behalf Of Chris Nolan
> > > Sent: Sunday, February 15, 2004 4:42 PM
> > > To: dbmail-dev@dbmail.org
> > > Subject: Re: [Dbmail-dev] some speed tests
> > >
> > >
> > > Dear Aaron,
> > >
> > > My comments are also inline. :-)
> > >
> > > On Sun, 2004-02-15 at 12:27, Aaron Stone wrote:
> > > > Comments inline...
> > > >
> > > > Chris Nolan <chris@imi.com.au> said:
> > > >
> > > > > Forgive the bluntness of the statement, but why is anyone even
> > > > > worrying about transactions as they relate to MySQL???
> > > >
> > > > Because we are currently structured to have a single set of
> > > > mid-level
> > > > database operations that are translated into specific low-level
> > > > database function calls. The upside is that there isn't any
> database
> >
> > > > specific handling outside of these mid-level functions. The
> > > > downside
> >
> > > > is that we need to make sure that function calls for certain
> > features
> > > > are consistent with analogous features in each database and that
> > they
> > > > carry with them enough information to make the appropriate
> > > > low-level
> >
> > > > call. A good example is the last inserted id number. In MySQL, you
> > > > only need the database connection identifier to get this. In
> > > > PostgreSQL, yo need both the database connection and the table
> > > > identifier. For DBMail to have a mid-level function that worked
> for
> > > > both, we'd have to make sure that it took both arguments and used
> > them
> > >
> > > > as needed for whichever database's low-level calls were being
> > > > used. (Those were off the top of my head, so they may be
> > > > incorrect, but
> > they
> > >
> > > > do illustrate my point.)
> > >
> > > I've looked through the DBMail source code on a few occassions and
> > > even released a dodgy tool to convert Cyrus mailboxes to DBMail a
> > > while ago. The fact that I can use the calls present in the DB
> modules
> >
> > > and avoid having to write queries by hand for each different DB is
> > > an
> > > excellent feature of the DBMail API. I have nothing negative to say
> > > here at all.
> > >
> > > >
> > > > > COMMIT in MySQL is passed to the table handler. In the case of
> > > > > MyISAM tables, the handler disregards the statement. For InnoDB
> > > > > and BDB
> > > tables,
> > > > > COMMIT acts as it does in PostgreSQL.
> > > >
> > > > So then we have THREE different configurations to consider, and
> > > > need to be sure to design the mid-level interface appropriately.
> > >
> > > My statement was meant to say that the MyISAM table handler will
> > > just
> > > disregard BEGIN and COMMIT statements, just as it parses but ignores
>
> > > CHECK and FOREIGN KEY constraints in table creation, thus
> illustrating
> >
> > > that you wouldn't need to add a mysql-with-transactions directory to
> > > the source tree.
> > >
> > > >
> > > > > So various people in this thread implying that MySQL isn't
> > > > > really
> > > > > a database need to do some more reading.
> > > >
> > > > If you're referring to my suggestion that the transaction
> > > > functions are a noop for MySQL, then you're reading too deeply...
> > > > I just
> > didn't
> > > > realize that InnoDB would handle transactions entirely normally.
> > >
> > > My apologies, I was not referring to you at all! Yourself and your
> > > collegues have provided the world with a very, very funky mail
> > > repository!
> > >
> > > I was referring to someone who said in the body of their message
> > > posted to another branch of this thread "the supposed database
> mysql".
> >
> > > Considering MySQL has proven itself time and time again in terms of
> > > reliability, licencing flexibility and performance, such statements
> > > are baseless regardless of what arguments you make for which
> features.
> >
> > > For example, Visual FoxPro's backend is technically a database but
> > > provides nothing in the way of a privilege system (you need write
> > > access to record locking, so filesystem-level controls are no good
> > > either).
> > >
> > > >
> > > > > In summary, just encapsulate everything in transaction blocks
> > > > > and the underlaying database will act appropriately.
> > > >
> > > > Right. That's what we're talking about.
> > > >
> > > > Is the entire point of your comment that we can safely use
> > > >
> > > > BEGIN;
> > > >
> > > > .. whatever dbmail does...
> > > >
> > > > COMMIT;
> > > >
> > > > regardless of whether or not we know if the host database actually
>
> > > > supports these keywords? You could have just said that.
> > >
> > > I could have, but then I would have risked putting an end to the
> > > thread.
> > > :-)
> > >
> > > Please accept my apology - I never meant to offend yourself or
> > > anyone
> > > else who contributes to DBMail. It seems my reading of the post I
> > > mentioned above caused me to be a bit heavier than I should have
> been.
> > >
> > > But yes, you have perfectly summarised the point of my comment.
> > > >
> > > > > Regards,
> > > > >
> > > > > Chris
> > > > >
> > > > > Aaron Stone wrote:
> > > > >
> > > > > >I don't even know where to begin in terms of designing the
> > > > > >delivery
> > >
> > > > > >chain around transactions. Could we do it as simply as adding
> > > > > >functions...
> > > > > >
> > > > > > void db_begin_transaction(void);
> > > > > > void db_flush_transaction(void); (or db_commit_...?)
> > > > > >
> > > > > >and then calling these functions before and after each major
> > > > > >section of database code? For the delivery chain, we could do
> > > > > >it inside of insert_message(). For dbmail-smtp, this basically
> > > > > >means
> >
> > > > > >that the execution of the whole program is within one
> > transaction.
> > > > > >For dbmail-lmtpd, it means that each message is delivered
> > > > > >within
> > a
> > > > > >transaction but the miscellaneous queries before the main
> > > > > >message
> >
> > > > > >delivery chain are not transacted. For MySQL, these functions
> > would
> > >
> > > > > >be noops.
> > > > > >
> > > > > >Thing that might work?
> > > > > >
> > > > > >Aaron
> > > > > >
> > > > > >
> > > > > >Thomas Mueller <dbmail@tmueller.com> said:
> > > > > >
> > > > > >
> > > > > >
> > > > > >>Hi Aaron,
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>>Do you have any way of narrowing this down to specific
> > > > > >>>queries that are taking the longest and/or are being executed
>
> > > > > >>>the most?
> >
> > > > > >>>That would identify which low-level database functions are
> > being
> > > > > >>>called, then we can just trace our way up the call chain to
> > > > > >>>see
> >
> > > > > >>>who's misbehaving or acting on a flawed design. Also, if you
> > > > > >>>could run similar tests against the latest 1.2, it would help
> > to
> > > > > >>>give a frame of reference, particularly for my delivery chain
> > > > > >>>design.
> > > > > >>>
> > > > > >>>
> > > > > >>That's simple: the main design flaw (actually that's no design
>
> > > > > >>flaw I think, that's because the so called database MySQL
> > couldn't
> > >
> > > > > >>do transactions in the past) is that dbmail doesn't use
> > > > > >>transaction.
> > > > > >>
> > > > > >>Because of that AutoCommit is used and whenever dbmail does
> > > > > >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' - and
> > that
> > >
> > > > > >>is terribly slow. To ensure the Durability in ACID the
> > > > > >>database has to fflush() every transaction to stable storage!
> > > > > >>That's why there is only one solution: we have to use
> > > > > >>transaction.
> > > > > >>
> > > > > >>With transactions we could remove the integrity checks of
> > > > > >>dbmail-maintenance too, because the database guarantees
> > integrity.
> > > > > >>
> > > > > >>Anyway, I did a trace of all SQL queries when a mail is copied
>
> > > > > >>using IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db
> > operations
> > > > > >>to insert one mail?).
> > > > > >>
> > > > > >>When searching for the sequential scan I found something quite
>
> > > > > >>interesting in the docs: the planer decides for every scan if
> > > > > >>a seqScan is cheaper that an index scan, and does a seqScan
> > > > > >>even
> > if
> > > > > >>an index
> > > > > >>exists:
> > > > > >>
> > > > > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> > > owner_idnr=2;
> > > > > >> QUERY PLAN
> > > > > >>---------------------------------------------------------
> > > > > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > > > > >> Filter: (owner_idnr = 2)
> > > > > >>(2 rows)
> > > > > >>
> > > > > >>The table has an index on owner_idnr.
> > > > > >>
> > > > > >>So I should repeat this test with a database with several
> > > > > >>hundred
> > > > > >>to thousand user, several dozen mailboxes for each user and
> > > > > >>several dozen mails in each mailbox to find out if all
> required
> > > > > >>indizes are there. Did anyone write a script to create such a
> > > > > >>database?
> > > > > >>
> > > > > >>But I found a strange query:
> > > > > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4'
> > > > > >>AND owner_idnr = '2' mailbox_idnr is the primary key so that
> > > > > >>could
> > be
> > > > > >>optimized to: SELECT 4
> > > > > >>;-)
> > > > > >>
> > > > > >>
> > > > > >>I don't have a 1.2 installation, I'm sorry.
> > > > > >>
> > > > > >>
> > > > > >>--
> > > > > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key
> > > (95702B3B)
> > > > > >>_______________________________________________
> > > > > >>Dbmail-dev mailing list
> > > > > >>Dbmail-dev@dbmail.org
> > > > > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Dbmail-dev mailing list
> > > > > Dbmail-dev@dbmail.org
> > > > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > > >
> > > >
> > > >
> > >
> > > _______________________________________________
> > > Dbmail-dev mailing list
> > > Dbmail-dev@dbmail.org
> > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > _______________________________________________
> > > Dbmail-dev mailing list
> > > Dbmail-dev@dbmail.org
> > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> >
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev

1 2  View All