Mailing List Archive: some speed tests

RE: some speed tests [ In reply to ]

Feb 15, 2004, 4:47 AM

Post #26 of 30 (1339 views)

No, aspseek doesn't use transactions as mysql 3.23 in the standard
distro doesn't have them.
The version I wrote for postgres does however.

I'm not sure if the mysql mailing lists was consulted, but it was a
known problem on the aspseek mailing lists.

... John

-----Original Message-----
From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
On Behalf Of Chris Nolan
Sent: Sunday, February 15, 2004 9:56 PM
To: dbmail-dev@dbmail.org
Subject: RE: [Dbmail-dev] some speed tests

Hmm...does ASPSeek utilise transactions at all?

The fact that it happened with both points to some glaring bug inside
MySQL's upper layers or a hardware problem, but since PostgreSQL was
fine (I'm assuming on the same hardware), that's a totally discounted
possibility.

Did anyone on the MySQL mailing list give you any joy?

Regards,

Chris

On Sun, 2004-02-15 at 21:40, John Hansen wrote:
> We tried both innodb and myisam with the same results, except of
> course performance suffered when using the innodb type. However both
> types resulted in corrupted tables after a few days starting from
> scratch.
>
> Following that corruption happened daily.
>
> I abandoned mysql early last year as I rewrote the application to use
> postgresql and haven't had any problems since.
>
> Other reasons I abandoned mysql was that to my dismay I discovered
> that statements such as select .. From table for update; is allowed.
> This should definately throw an error since for update is not
> supported and thus would lead to inconsistent data.
>
> I have not attempted to use mysql 4.
>
> -----Original Message-----
> From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> On Behalf Of Chris Nolan
> Sent: Sunday, February 15, 2004 9:31 PM
> To: dbmail-dev@dbmail.org
> Subject: RE: [Dbmail-dev] some speed tests
>
>
> Now you've peaked my curiosity!
>
> Do you have any additional details regarding table types used and
> whether or not the same thing occurs in the 4.0.x series?
>
> Regards,
>
> Chris
>
> On Sun, 2004-02-15 at 21:14, John Hansen wrote:
> > Download and install aspseek-1.2.10 from http://www.aspseek.org Run
> > this against mysql 3.23.49
> >
> > Run it for a few days, indexing about 1million urls,
> > Then keep reindexing them and watch your database corrupt...
> >
> > Regards,
> >
> > John
> >
> > -----Original Message-----
> > From: dbmail-dev-admin@dbmail.org
> > [mailto:dbmail-dev-admin@dbmail.org]
> > On Behalf Of Chris Nolan
> > Sent: Sunday, February 15, 2004 6:37 PM
> > To: dbmail-dev@dbmail.org
> > Subject: RE: [Dbmail-dev] some speed tests
> >
> >
> > Table corruptions?
> >
> > I'd be very interested to hear the story around your grief, if for
> > nothing else, for reference purposes. :-)
> >
> > Regards,
> >
> > Chris
> >
> > On Sun, 2004-02-15 at 17:19, John Hansen wrote:
> > > Personally, I recent the statement; 'Considering MySQL has proven
> > > itself time and time again in terms of reliability, licencing
> > > flexibility and performance, such statements are baseless
> > > regardless
>
> > > of what arguments you make for which features'; based on
> > > experience,
>
> > > mysql proved that it was not reliable, tho performance was top of
> the
> > > line.
> > >
> > > In my opinion, if you're concerned with reliability, mysql should
> > > not
> > > be an option. I had daily table corruptions, and thus one of my
main
>
> > > tasks was to run mysql repair jobs regularly, just to keep
> everything
> > > running.
> > >
> > > Granted, postgresql has it's own problems, mainly with performance
> > > and
> >
> > > lack of a proper master / slave replication model, but that's a
> > > small
> > > price to pay for reliability.
> > >
> > > Just my $0.02 worth.
> > >
> > > Regards,
> > >
> > > John Hansen
> > >
> > > -----Original Message-----
> > > From: dbmail-dev-admin@dbmail.org
> > > [mailto:dbmail-dev-admin@dbmail.org]
> > > On Behalf Of Chris Nolan
> > > Sent: Sunday, February 15, 2004 4:42 PM
> > > To: dbmail-dev@dbmail.org
> > > Subject: Re: [Dbmail-dev] some speed tests
> > >
> > >
> > > Dear Aaron,
> > >
> > > My comments are also inline. :-)
> > >
> > > On Sun, 2004-02-15 at 12:27, Aaron Stone wrote:
> > > > Comments inline...
> > > >
> > > > Chris Nolan <chris@imi.com.au> said:
> > > >
> > > > > Forgive the bluntness of the statement, but why is anyone even
> > > > > worrying about transactions as they relate to MySQL???
> > > >
> > > > Because we are currently structured to have a single set of
> > > > mid-level database operations that are translated into specific
> > > > low-level database function calls. The upside is that there
> > > > isn't any
> database
> >
> > > > specific handling outside of these mid-level functions. The
> > > > downside
> >
> > > > is that we need to make sure that function calls for certain
> > features
> > > > are consistent with analogous features in each database and that
> > they
> > > > carry with them enough information to make the appropriate
> > > > low-level
> >
> > > > call. A good example is the last inserted id number. In MySQL,
> > > > you only need the database connection identifier to get this. In

> > > > PostgreSQL, yo need both the database connection and the table
> > > > identifier. For DBMail to have a mid-level function that worked
> for
> > > > both, we'd have to make sure that it took both arguments and
> > > > used
> > them
> > >
> > > > as needed for whichever database's low-level calls were being
> > > > used. (Those were off the top of my head, so they may be
> > > > incorrect, but
> > they
> > >
> > > > do illustrate my point.)
> > >
> > > I've looked through the DBMail source code on a few occassions and

> > > even released a dodgy tool to convert Cyrus mailboxes to DBMail a
> > > while ago. The fact that I can use the calls present in the DB
> modules
> >
> > > and avoid having to write queries by hand for each different DB is
> > > an
> > > excellent feature of the DBMail API. I have nothing negative to
say
> > > here at all.
> > >
> > > >
> > > > > COMMIT in MySQL is passed to the table handler. In the case of
> > > > > MyISAM tables, the handler disregards the statement. For
InnoDB
> > > > > and BDB
> > > tables,
> > > > > COMMIT acts as it does in PostgreSQL.
> > > >
> > > > So then we have THREE different configurations to consider, and
> > > > need to be sure to design the mid-level interface appropriately.
> > >
> > > My statement was meant to say that the MyISAM table handler will
> > > just
> > > disregard BEGIN and COMMIT statements, just as it parses but
ignores
>
> > > CHECK and FOREIGN KEY constraints in table creation, thus
> illustrating
> >
> > > that you wouldn't need to add a mysql-with-transactions directory
> > > to the source tree.
> > >
> > > >
> > > > > So various people in this thread implying that MySQL isn't
> > > > > really
> > > > > a database need to do some more reading.
> > > >
> > > > If you're referring to my suggestion that the transaction
> > > > functions are a noop for MySQL, then you're reading too
deeply...
> > > > I just
> > didn't
> > > > realize that InnoDB would handle transactions entirely normally.
> > >
> > > My apologies, I was not referring to you at all! Yourself and your

> > > collegues have provided the world with a very, very funky mail
> > > repository!
> > >
> > > I was referring to someone who said in the body of their message
> > > posted to another branch of this thread "the supposed database
> mysql".
> >
> > > Considering MySQL has proven itself time and time again in terms
> > > of reliability, licencing flexibility and performance, such
> > > statements are baseless regardless of what arguments you make for
> > > which
> features.
> >
> > > For example, Visual FoxPro's backend is technically a database but

> > > provides nothing in the way of a privilege system (you need write
> > > access to record locking, so filesystem-level controls are no good

> > > either).
> > >
> > > >
> > > > > In summary, just encapsulate everything in transaction blocks
> > > > > and the underlaying database will act appropriately.
> > > >
> > > > Right. That's what we're talking about.
> > > >
> > > > Is the entire point of your comment that we can safely use
> > > >
> > > > BEGIN;
> > > >
> > > > .. whatever dbmail does...
> > > >
> > > > COMMIT;
> > > >
> > > > regardless of whether or not we know if the host database
> > > > actually
>
> > > > supports these keywords? You could have just said that.
> > >
> > > I could have, but then I would have risked putting an end to the
> > > thread.
> > > :-)
> > >
> > > Please accept my apology - I never meant to offend yourself or
> > > anyone
> > > else who contributes to DBMail. It seems my reading of the post I
> > > mentioned above caused me to be a bit heavier than I should have
> been.
> > >
> > > But yes, you have perfectly summarised the point of my comment.
> > > >
> > > > > Regards,
> > > > >
> > > > > Chris
> > > > >
> > > > > Aaron Stone wrote:
> > > > >
> > > > > >I don't even know where to begin in terms of designing the
> > > > > >delivery
> > >
> > > > > >chain around transactions. Could we do it as simply as adding
> > > > > >functions...
> > > > > >
> > > > > > void db_begin_transaction(void);
> > > > > > void db_flush_transaction(void); (or db_commit_...?)
> > > > > >
> > > > > >and then calling these functions before and after each major
> > > > > >section of database code? For the delivery chain, we could do

> > > > > >it inside of insert_message(). For dbmail-smtp, this
basically
> > > > > >means
> >
> > > > > >that the execution of the whole program is within one
> > transaction.
> > > > > >For dbmail-lmtpd, it means that each message is delivered
> > > > > >within
> > a
> > > > > >transaction but the miscellaneous queries before the main
> > > > > >message
> >
> > > > > >delivery chain are not transacted. For MySQL, these functions
> > would
> > >
> > > > > >be noops.
> > > > > >
> > > > > >Thing that might work?
> > > > > >
> > > > > >Aaron
> > > > > >
> > > > > >
> > > > > >Thomas Mueller <dbmail@tmueller.com> said:
> > > > > >
> > > > > >
> > > > > >
> > > > > >>Hi Aaron,
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>>Do you have any way of narrowing this down to specific
> > > > > >>>queries that are taking the longest and/or are being
executed
>
> > > > > >>>the most?
> >
> > > > > >>>That would identify which low-level database functions are
> > being
> > > > > >>>called, then we can just trace our way up the call chain to
> > > > > >>>see
> >
> > > > > >>>who's misbehaving or acting on a flawed design. Also, if
> > > > > >>>you could run similar tests against the latest 1.2, it
> > > > > >>>would help
> > to
> > > > > >>>give a frame of reference, particularly for my delivery
> > > > > >>>chain design.
> > > > > >>>
> > > > > >>>
> > > > > >>That's simple: the main design flaw (actually that's no
> > > > > >>design
>
> > > > > >>flaw I think, that's because the so called database MySQL
> > couldn't
> > >
> > > > > >>do transactions in the past) is that dbmail doesn't use
> > > > > >>transaction.
> > > > > >>
> > > > > >>Because of that AutoCommit is used and whenever dbmail does
> > > > > >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' -
and
> > that
> > >
> > > > > >>is terribly slow. To ensure the Durability in ACID the
> > > > > >>database has to fflush() every transaction to stable
storage!
> > > > > >>That's why there is only one solution: we have to use
> > > > > >>transaction.
> > > > > >>
> > > > > >>With transactions we could remove the integrity checks of
> > > > > >>dbmail-maintenance too, because the database guarantees
> > integrity.
> > > > > >>
> > > > > >>Anyway, I did a trace of all SQL queries when a mail is
> > > > > >>copied
>
> > > > > >>using IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db
> > operations
> > > > > >>to insert one mail?).
> > > > > >>
> > > > > >>When searching for the sequential scan I found something
> > > > > >>quite
>
> > > > > >>interesting in the docs: the planer decides for every scan
> > > > > >>if
> > > > > >>a seqScan is cheaper that an index scan, and does a seqScan
> > > > > >>even
> > if
> > > > > >>an index
> > > > > >>exists:
> > > > > >>
> > > > > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> > > owner_idnr=2;
> > > > > >> QUERY PLAN
> > > > > >>---------------------------------------------------------
> > > > > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > > > > >> Filter: (owner_idnr = 2)
> > > > > >>(2 rows)
> > > > > >>
> > > > > >>The table has an index on owner_idnr.
> > > > > >>
> > > > > >>So I should repeat this test with a database with several
> > > > > >>hundred to thousand user, several dozen mailboxes for each
> > > > > >>user and several dozen mails in each mailbox to find out if
> > > > > >>all
> required
> > > > > >>indizes are there. Did anyone write a script to create such
> > > > > >>a
> > > > > >>database?
> > > > > >>
> > > > > >>But I found a strange query:
> > > > > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4'
> > > > > >>AND owner_idnr = '2' mailbox_idnr is the primary key so that

> > > > > >>could
> > be
> > > > > >>optimized to: SELECT 4
> > > > > >>;-)
> > > > > >>
> > > > > >>
> > > > > >>I don't have a 1.2 installation, I'm sorry.
> > > > > >>
> > > > > >>
> > > > > >>--
> > > > > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key
> > > (95702B3B)
> > > > > >>_______________________________________________
> > > > > >>Dbmail-dev mailing list
> > > > > >>Dbmail-dev@dbmail.org
> > > > > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Dbmail-dev mailing list
> > > > > Dbmail-dev@dbmail.org
> > > > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > > >
> > > >
> > > >
> > >
> > > _______________________________________________
> > > Dbmail-dev mailing list
> > > Dbmail-dev@dbmail.org
> > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > _______________________________________________
> > > Dbmail-dev mailing list
> > > Dbmail-dev@dbmail.org
> > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> >
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev

RE: some speed tests [ In reply to ]

chris at imi

Feb 15, 2004, 5:12 AM

Post #27 of 30 (1349 views)

Permalink

Several of my clients use DBMail with mail loads ranging from small (a
few users) to massive (100s of users). We use MySQL 4.0.x (x >= 13) in
all of these places and haven't noticed any corruption but that says
nothing about what you've experienced as the two loads are very, very
different to say the least.

Thanks for sharing your experiences! At a bare minimum, I'll know that
3.23.49 is a bad version of MySQL.

Regards,

Chris

On Sun, 2004-02-15 at 22:47, John Hansen wrote:
> No, aspseek doesn't use transactions as mysql 3.23 in the standard
> distro doesn't have them.
> The version I wrote for postgres does however.
>
> I'm not sure if the mysql mailing lists was consulted, but it was a
> known problem on the aspseek mailing lists.
>
> ... John
>
> -----Original Message-----
> From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> On Behalf Of Chris Nolan
> Sent: Sunday, February 15, 2004 9:56 PM
> To: dbmail-dev@dbmail.org
> Subject: RE: [Dbmail-dev] some speed tests
>
>
> Hmm...does ASPSeek utilise transactions at all?
>
> The fact that it happened with both points to some glaring bug inside
> MySQL's upper layers or a hardware problem, but since PostgreSQL was
> fine (I'm assuming on the same hardware), that's a totally discounted
> possibility.
>
> Did anyone on the MySQL mailing list give you any joy?
>
> Regards,
>
> Chris
>
> On Sun, 2004-02-15 at 21:40, John Hansen wrote:
> > We tried both innodb and myisam with the same results, except of
> > course performance suffered when using the innodb type. However both
> > types resulted in corrupted tables after a few days starting from
> > scratch.
> >
> > Following that corruption happened daily.
> >
> > I abandoned mysql early last year as I rewrote the application to use
> > postgresql and haven't had any problems since.
> >
> > Other reasons I abandoned mysql was that to my dismay I discovered
> > that statements such as select .. From table for update; is allowed.
> > This should definately throw an error since for update is not
> > supported and thus would lead to inconsistent data.
> >
> > I have not attempted to use mysql 4.
> >
> > -----Original Message-----
> > From: dbmail-dev-admin@dbmail.org [mailto:dbmail-dev-admin@dbmail.org]
> > On Behalf Of Chris Nolan
> > Sent: Sunday, February 15, 2004 9:31 PM
> > To: dbmail-dev@dbmail.org
> > Subject: RE: [Dbmail-dev] some speed tests
> >
> >
> > Now you've peaked my curiosity!
> >
> > Do you have any additional details regarding table types used and
> > whether or not the same thing occurs in the 4.0.x series?
> >
> > Regards,
> >
> > Chris
> >
> > On Sun, 2004-02-15 at 21:14, John Hansen wrote:
> > > Download and install aspseek-1.2.10 from http://www.aspseek.org Run
> > > this against mysql 3.23.49
> > >
> > > Run it for a few days, indexing about 1million urls,
> > > Then keep reindexing them and watch your database corrupt...
> > >
> > > Regards,
> > >
> > > John
> > >
> > > -----Original Message-----
> > > From: dbmail-dev-admin@dbmail.org
> > > [mailto:dbmail-dev-admin@dbmail.org]
> > > On Behalf Of Chris Nolan
> > > Sent: Sunday, February 15, 2004 6:37 PM
> > > To: dbmail-dev@dbmail.org
> > > Subject: RE: [Dbmail-dev] some speed tests
> > >
> > >
> > > Table corruptions?
> > >
> > > I'd be very interested to hear the story around your grief, if for
> > > nothing else, for reference purposes. :-)
> > >
> > > Regards,
> > >
> > > Chris
> > >
> > > On Sun, 2004-02-15 at 17:19, John Hansen wrote:
> > > > Personally, I recent the statement; 'Considering MySQL has proven
> > > > itself time and time again in terms of reliability, licencing
> > > > flexibility and performance, such statements are baseless
> > > > regardless
> >
> > > > of what arguments you make for which features'; based on
> > > > experience,
> >
> > > > mysql proved that it was not reliable, tho performance was top of
> > the
> > > > line.
> > > >
> > > > In my opinion, if you're concerned with reliability, mysql should
> > > > not
> > > > be an option. I had daily table corruptions, and thus one of my
> main
> >
> > > > tasks was to run mysql repair jobs regularly, just to keep
> > everything
> > > > running.
> > > >
> > > > Granted, postgresql has it's own problems, mainly with performance
> > > > and
> > >
> > > > lack of a proper master / slave replication model, but that's a
> > > > small
> > > > price to pay for reliability.
> > > >
> > > > Just my $0.02 worth.
> > > >
> > > > Regards,
> > > >
> > > > John Hansen
> > > >
> > > > -----Original Message-----
> > > > From: dbmail-dev-admin@dbmail.org
> > > > [mailto:dbmail-dev-admin@dbmail.org]
> > > > On Behalf Of Chris Nolan
> > > > Sent: Sunday, February 15, 2004 4:42 PM
> > > > To: dbmail-dev@dbmail.org
> > > > Subject: Re: [Dbmail-dev] some speed tests
> > > >
> > > >
> > > > Dear Aaron,
> > > >
> > > > My comments are also inline. :-)
> > > >
> > > > On Sun, 2004-02-15 at 12:27, Aaron Stone wrote:
> > > > > Comments inline...
> > > > >
> > > > > Chris Nolan <chris@imi.com.au> said:
> > > > >
> > > > > > Forgive the bluntness of the statement, but why is anyone even
> > > > > > worrying about transactions as they relate to MySQL???
> > > > >
> > > > > Because we are currently structured to have a single set of
> > > > > mid-level database operations that are translated into specific
> > > > > low-level database function calls. The upside is that there
> > > > > isn't any
> > database
> > >
> > > > > specific handling outside of these mid-level functions. The
> > > > > downside
> > >
> > > > > is that we need to make sure that function calls for certain
> > > features
> > > > > are consistent with analogous features in each database and that
> > > they
> > > > > carry with them enough information to make the appropriate
> > > > > low-level
> > >
> > > > > call. A good example is the last inserted id number. In MySQL,
> > > > > you only need the database connection identifier to get this. In
>
> > > > > PostgreSQL, yo need both the database connection and the table
> > > > > identifier. For DBMail to have a mid-level function that worked
> > for
> > > > > both, we'd have to make sure that it took both arguments and
> > > > > used
> > > them
> > > >
> > > > > as needed for whichever database's low-level calls were being
> > > > > used. (Those were off the top of my head, so they may be
> > > > > incorrect, but
> > > they
> > > >
> > > > > do illustrate my point.)
> > > >
> > > > I've looked through the DBMail source code on a few occassions and
>
> > > > even released a dodgy tool to convert Cyrus mailboxes to DBMail a
> > > > while ago. The fact that I can use the calls present in the DB
> > modules
> > >
> > > > and avoid having to write queries by hand for each different DB is
> > > > an
> > > > excellent feature of the DBMail API. I have nothing negative to
> say
> > > > here at all.
> > > >
> > > > >
> > > > > > COMMIT in MySQL is passed to the table handler. In the case of
> > > > > > MyISAM tables, the handler disregards the statement. For
> InnoDB
> > > > > > and BDB
> > > > tables,
> > > > > > COMMIT acts as it does in PostgreSQL.
> > > > >
> > > > > So then we have THREE different configurations to consider, and
> > > > > need to be sure to design the mid-level interface appropriately.
> > > >
> > > > My statement was meant to say that the MyISAM table handler will
> > > > just
> > > > disregard BEGIN and COMMIT statements, just as it parses but
> ignores
> >
> > > > CHECK and FOREIGN KEY constraints in table creation, thus
> > illustrating
> > >
> > > > that you wouldn't need to add a mysql-with-transactions directory
> > > > to the source tree.
> > > >
> > > > >
> > > > > > So various people in this thread implying that MySQL isn't
> > > > > > really
> > > > > > a database need to do some more reading.
> > > > >
> > > > > If you're referring to my suggestion that the transaction
> > > > > functions are a noop for MySQL, then you're reading too
> deeply...
> > > > > I just
> > > didn't
> > > > > realize that InnoDB would handle transactions entirely normally.
> > > >
> > > > My apologies, I was not referring to you at all! Yourself and your
>
> > > > collegues have provided the world with a very, very funky mail
> > > > repository!
> > > >
> > > > I was referring to someone who said in the body of their message
> > > > posted to another branch of this thread "the supposed database
> > mysql".
> > >
> > > > Considering MySQL has proven itself time and time again in terms
> > > > of reliability, licencing flexibility and performance, such
> > > > statements are baseless regardless of what arguments you make for
> > > > which
> > features.
> > >
> > > > For example, Visual FoxPro's backend is technically a database but
>
> > > > provides nothing in the way of a privilege system (you need write
> > > > access to record locking, so filesystem-level controls are no good
>
> > > > either).
> > > >
> > > > >
> > > > > > In summary, just encapsulate everything in transaction blocks
> > > > > > and the underlaying database will act appropriately.
> > > > >
> > > > > Right. That's what we're talking about.
> > > > >
> > > > > Is the entire point of your comment that we can safely use
> > > > >
> > > > > BEGIN;
> > > > >
> > > > > .. whatever dbmail does...
> > > > >
> > > > > COMMIT;
> > > > >
> > > > > regardless of whether or not we know if the host database
> > > > > actually
> >
> > > > > supports these keywords? You could have just said that.
> > > >
> > > > I could have, but then I would have risked putting an end to the
> > > > thread.
> > > > :-)
> > > >
> > > > Please accept my apology - I never meant to offend yourself or
> > > > anyone
> > > > else who contributes to DBMail. It seems my reading of the post I
> > > > mentioned above caused me to be a bit heavier than I should have
> > been.
> > > >
> > > > But yes, you have perfectly summarised the point of my comment.
> > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Chris
> > > > > >
> > > > > > Aaron Stone wrote:
> > > > > >
> > > > > > >I don't even know where to begin in terms of designing the
> > > > > > >delivery
> > > >
> > > > > > >chain around transactions. Could we do it as simply as adding
> > > > > > >functions...
> > > > > > >
> > > > > > > void db_begin_transaction(void);
> > > > > > > void db_flush_transaction(void); (or db_commit_...?)
> > > > > > >
> > > > > > >and then calling these functions before and after each major
> > > > > > >section of database code? For the delivery chain, we could do
>
> > > > > > >it inside of insert_message(). For dbmail-smtp, this
> basically
> > > > > > >means
> > >
> > > > > > >that the execution of the whole program is within one
> > > transaction.
> > > > > > >For dbmail-lmtpd, it means that each message is delivered
> > > > > > >within
> > > a
> > > > > > >transaction but the miscellaneous queries before the main
> > > > > > >message
> > >
> > > > > > >delivery chain are not transacted. For MySQL, these functions
> > > would
> > > >
> > > > > > >be noops.
> > > > > > >
> > > > > > >Thing that might work?
> > > > > > >
> > > > > > >Aaron
> > > > > > >
> > > > > > >
> > > > > > >Thomas Mueller <dbmail@tmueller.com> said:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >>Hi Aaron,
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >>>Do you have any way of narrowing this down to specific
> > > > > > >>>queries that are taking the longest and/or are being
> executed
> >
> > > > > > >>>the most?
> > >
> > > > > > >>>That would identify which low-level database functions are
> > > being
> > > > > > >>>called, then we can just trace our way up the call chain to
> > > > > > >>>see
> > >
> > > > > > >>>who's misbehaving or acting on a flawed design. Also, if
> > > > > > >>>you could run similar tests against the latest 1.2, it
> > > > > > >>>would help
> > > to
> > > > > > >>>give a frame of reference, particularly for my delivery
> > > > > > >>>chain design.
> > > > > > >>>
> > > > > > >>>
> > > > > > >>That's simple: the main design flaw (actually that's no
> > > > > > >>design
> >
> > > > > > >>flaw I think, that's because the so called database MySQL
> > > couldn't
> > > >
> > > > > > >>do transactions in the past) is that dbmail doesn't use
> > > > > > >>transaction.
> > > > > > >>
> > > > > > >>Because of that AutoCommit is used and whenever dbmail does
> > > > > > >>anySqlQuery Postgres does 'BEGIN; anySqlQuery; COMMIT;' -
> and
> > > that
> > > >
> > > > > > >>is terribly slow. To ensure the Durability in ACID the
> > > > > > >>database has to fflush() every transaction to stable
> storage!
> > > > > > >>That's why there is only one solution: we have to use
> > > > > > >>transaction.
> > > > > > >>
> > > > > > >>With transactions we could remove the integrity checks of
> > > > > > >>dbmail-maintenance too, because the database guarantees
> > > integrity.
> > > > > > >>
> > > > > > >>Anyway, I did a trace of all SQL queries when a mail is
> > > > > > >>copied
> >
> > > > > > >>using IMAP. I got 35 SELECT, 4 INSERT, 5 UPDATE (44 db
> > > operations
> > > > > > >>to insert one mail?).
> > > > > > >>
> > > > > > >>When searching for the sequential scan I found something
> > > > > > >>quite
> >
> > > > > > >>interesting in the docs: the planer decides for every scan
> > > > > > >>if
> > > > > > >>a seqScan is cheaper that an index scan, and does a seqScan
> > > > > > >>even
> > > if
> > > > > > >>an index
> > > > > > >>exists:
> > > > > > >>
> > > > > > >>dbmail=> explain SELECT mailbox_idnr FROM mailboxes WHERE
> > > > owner_idnr=2;
> > > > > > >> QUERY PLAN
> > > > > > >>---------------------------------------------------------
> > > > > > >> Seq Scan on mailboxes (cost=0.00..1.06 rows=3 width=8)
> > > > > > >> Filter: (owner_idnr = 2)
> > > > > > >>(2 rows)
> > > > > > >>
> > > > > > >>The table has an index on owner_idnr.
> > > > > > >>
> > > > > > >>So I should repeat this test with a database with several
> > > > > > >>hundred to thousand user, several dozen mailboxes for each
> > > > > > >>user and several dozen mails in each mailbox to find out if
> > > > > > >>all
> > required
> > > > > > >>indizes are there. Did anyone write a script to create such
> > > > > > >>a
> > > > > > >>database?
> > > > > > >>
> > > > > > >>But I found a strange query:
> > > > > > >>SELECT mailbox_idnr FROM mailboxes WHERE mailbox_idnr = '4'
> > > > > > >>AND owner_idnr = '2' mailbox_idnr is the primary key so that
>
> > > > > > >>could
> > > be
> > > > > > >>optimized to: SELECT 4
> > > > > > >>;-)
> > > > > > >>
> > > > > > >>
> > > > > > >>I don't have a 1.2 installation, I'm sorry.
> > > > > > >>
> > > > > > >>
> > > > > > >>--
> > > > > > >>MfG Thomas Mueller - http://www.tmueller.com for pgp key
> > > > (95702B3B)
> > > > > > >>_______________________________________________
> > > > > > >>Dbmail-dev mailing list
> > > > > > >>Dbmail-dev@dbmail.org
> > > > > > >>http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > _______________________________________________
> > > > > > Dbmail-dev mailing list
> > > > > > Dbmail-dev@dbmail.org
> > > > > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > > > >
> > > > >
> > > > >
> > > >
> > > > _______________________________________________
> > > > Dbmail-dev mailing list
> > > > Dbmail-dev@dbmail.org
> > > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > > _______________________________________________
> > > > Dbmail-dev mailing list
> > > > Dbmail-dev@dbmail.org
> > > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > >
> > > _______________________________________________
> > > Dbmail-dev mailing list
> > > Dbmail-dev@dbmail.org
> > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > > _______________________________________________
> > > Dbmail-dev mailing list
> > > Dbmail-dev@dbmail.org
> > > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> >
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> > _______________________________________________
> > Dbmail-dev mailing list
> > Dbmail-dev@dbmail.org
> > http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev

Re: some speed tests [ In reply to ]

dbmail at tmueller

Feb 15, 2004, 6:47 AM

Post #28 of 30 (1342 views)

Permalink

Hi Chris,

> > > My statement was meant to say that the MyISAM table handler will just
> > > disregard BEGIN and COMMIT statements, just as it parses but ignores
> > > CHECK and FOREIGN KEY constraints in table creation, thus illustrating
> > > that you wouldn't need to add a mysql-with-transactions directory to the
> > > source tree.
> >
> > But don't we have to differentiate between a database with and a
> > database without transactions?
> > If we have transactions we can simply do all the database stuff, if
> > anything fails we do the rollback and bail out. We still have a
> > perfectly consistent database.
> > Without transactions we have to remember every successful database
> > modification we made and have to revert that.
>
> That would be very difficult in the case of UPDATEs though.
> Additionally, with MyISAM tables there is a problem with race hazards.
> For example:
>
> SELECT * FROM table WHERE id = 3;
> UPDATE table SET attr = 'thing' WHERE id = 3;
> // Error occurs
> UPDATE table SET attr = $oldvalue WHERE id = 3;
>
> In the above statement, you have no way of ensuring that no one else has
> buggered around with the row we're interested in. As DBMail's queries
> seem to be primarily INSERTs and SELECTs though, this isn't as big a
> problem as using LOCK TABLE statements would not result in a noticable
> performance hit.

Yes adding a new message is the only critical part I think. If we have
the message in the database there are only three possible updates:
change read status, move to another mailbox, delete mail. Every single
update can fail, no problem.

But the insert consists of 35 SELECT, 4 INSERT and 5 UPDATE.

So there is only one solution: transactions :-)

> > > I was referring to someone who said in the body of their message posted
> > > to another branch of this thread "the supposed database mysql".
> > > Considering MySQL has proven itself time and time again in terms of
> > > reliability, licencing flexibility and performance, such statements are
> > > baseless regardless of what arguments you make for which features.
> >
> > That was me, I'm sorry! I didn't want to start a MySQL bashing thread. I
> > had lot of problems with MySQL in the past, that's why I said that. In my
> > eyes a database should support some basic things to have the right to
> > carry the name SQL database. Among these are transactions, foreign key
> > constraints, stored procedures, trigger, subselects. MySQL had not even
> > one of these features (today they probably have most of these features)
> > leading to poorly designed software, where the application had to do lot
> > of work the database should do (a good example in dbmail are the quotas
> > - the database could and should calculate that on its own).
> > The case insensitiveness of MySQL is another point where I had lot of
> > problems in the past. I only know two case insensitive databases: MySQL
> > and MS Access. Every other database is case sensitive (Postgres, Oracle,
> > MS SQL, what-ever).
> > And one should never forget the dual license of MySQL.
> > On the other hand people tell me that MySQL is faster than Postgres -
> > that's simply wrong. It's only faster if you use the database type with
> > less features. There are tons of threads on the postgresql list where
> > these two databases (and others) are compared.
> > That's why my heart beats faster when I hear the name MySQL - I'm sorry,
> > I'll try to suppress these sidekicks.
>
> The reason I react to statements against my favourite database, MySQL,
> is simply due to the things I have done with it and things I have seen
> others do with it.

There are lot servers in the world with critical data for companies on
cheap x86 hardware with Windows <what ever>. So I can conclude that
this combination is good for that usage? I don't think so.
But this is really off topic at the moment :)

> Today, MySQL does have transactions. foreign key constraints and
> subselect and stored procedure functionality are in beta.

What about trigger?

> MySQL's case insensitivity is a result of having to compensate for
> Windows' terrible deficencies in some areas and a design choice in
> others (which is actually a pretty stupid design choice if you ask me -
> case-sensitive string comparisons are always faster).

Why the Windows part? MS SQL is case sensitive.

> Regarding the comparisons you speak of, could you point me to some of
> them? I'm yet to see a recent, decent comparison between the two. I'd
> love to see PostgreSQL 7.4.1 vs MySQL 4.0.17 / 4.1.1 on Linux 2.6.x or
> FreeBSD 5.2.x .

If you point your newsreader to gmane.org and search in the group
gmane.comp.db.postgresql.general and
gmane.comp.db.postgresql.performance for topics with mysql you'll find
lot of interesting threads.

The problem is that you can't really compare Postgres with MySQL. You
could compare Postgres and Oracle for example, but MySQL has lot of
features missing so it has to be faster.
Comparing Postgres with a current Beta (of MySQL 5?) would be on the one
hand fair because they have similar features, on the other hand it's not
fair because Postgres has these features for years, in MySQL they are
brandnew and beta.

So people tend to compare the same application with both Postgres and
MySQL. Not fair again because quite a lot of intelligence is done in the
application you would normally let the database do if you didn't have to
take care of MySQL.

I would recommend to use MySQL for applications where you have 100 read
operations and one write, there MySQL is unbeaten fast. For critical
data I would never recommend MySQL. At the moment, this might change in
future.
BTW: I wouldn't always recommend Postgres for that job too. Some
commercial database can have several advantages - so always use the best
database for the job to do :-)

> I find your comment regarding the dual licence very interesting. If
> you'd like to continue this (hopefully friendly) discussion, please
> write to me off the list. :-)

I always try to be friendly :-), but I don't see anything to discuss.
The license says everything.

> > > > BEGIN;
> > > > .. whatever dbmail does...
> > > > COMMIT;
[..]
> > As a quick speed up that's a good solution yes. But if it's the right
> > permanent solution?
>
> Personally, I think it is. MyISAM in MySQL is a great table type for lots
> and lots of INSERT statements OR lots and lots
> of SELECT statements but not both (quoting Jeremy Zawdony from Yahoo!).
> Anyone serious about performance in a case like DBMail would probably
> use InnoDB anyway. The fact that it is now part of the standard MySQL
> distro gives little reason to not use MyISAM tables except for their
> ability to perform FULLTEXT searches.

Sorry is this a typo or did you really mean MyISAM? So you wouldn't use
transactions but let MySQL simply ignore the transaction commands?

--
MfG Thomas Mueller - http://www.tmueller.com for pgp key (95702B3B)

Re: some speed tests [ In reply to ]

chris at imi

Feb 15, 2004, 7:32 AM

Post #29 of 30 (1338 views)

Permalink

Thomas Mueller wrote:

>Hi Chris,
>
>
>
>>>>My statement was meant to say that the MyISAM table handler will just
>>>>disregard BEGIN and COMMIT statements, just as it parses but ignores
>>>>CHECK and FOREIGN KEY constraints in table creation, thus illustrating
>>>>that you wouldn't need to add a mysql-with-transactions directory to the
>>>>source tree.
>>>>
>>>>
>>>But don't we have to differentiate between a database with and a
>>>database without transactions?
>>>If we have transactions we can simply do all the database stuff, if
>>>anything fails we do the rollback and bail out. We still have a
>>>perfectly consistent database.
>>>Without transactions we have to remember every successful database
>>>modification we made and have to revert that.
>>>
>>>
>>That would be very difficult in the case of UPDATEs though.
>>Additionally, with MyISAM tables there is a problem with race hazards.
>>For example:
>>
>>SELECT * FROM table WHERE id = 3;
>>UPDATE table SET attr = 'thing' WHERE id = 3;
>>// Error occurs
>>UPDATE table SET attr = $oldvalue WHERE id = 3;
>>
>>In the above statement, you have no way of ensuring that no one else has
>>buggered around with the row we're interested in. As DBMail's queries
>>seem to be primarily INSERTs and SELECTs though, this isn't as big a
>>problem as using LOCK TABLE statements would not result in a noticable
>>performance hit.
>>
>>
>
>Yes adding a new message is the only critical part I think. If we have
>the message in the database there are only three possible updates:
>change read status, move to another mailbox, delete mail. Every single
>update can fail, no problem.
>
>But the insert consists of 35 SELECT, 4 INSERT and 5 UPDATE.
>
>So there is only one solution: transactions :-)
>
>
Right then, transactions are the solution!

>
>
>>>>I was referring to someone who said in the body of their message posted
>>>>to another branch of this thread "the supposed database mysql".
>>>>Considering MySQL has proven itself time and time again in terms of
>>>>reliability, licencing flexibility and performance, such statements are
>>>>baseless regardless of what arguments you make for which features.
>>>>
>>>>
>>>That was me, I'm sorry! I didn't want to start a MySQL bashing thread. I
>>>had lot of problems with MySQL in the past, that's why I said that. In my
>>>eyes a database should support some basic things to have the right to
>>>carry the name SQL database. Among these are transactions, foreign key
>>>constraints, stored procedures, trigger, subselects. MySQL had not even
>>>one of these features (today they probably have most of these features)
>>>leading to poorly designed software, where the application had to do lot
>>>of work the database should do (a good example in dbmail are the quotas
>>>- the database could and should calculate that on its own).
>>>The case insensitiveness of MySQL is another point where I had lot of
>>>problems in the past. I only know two case insensitive databases: MySQL
>>>and MS Access. Every other database is case sensitive (Postgres, Oracle,
>>>MS SQL, what-ever).
>>>And one should never forget the dual license of MySQL.
>>>On the other hand people tell me that MySQL is faster than Postgres -
>>>that's simply wrong. It's only faster if you use the database type with
>>>less features. There are tons of threads on the postgresql list where
>>>these two databases (and others) are compared.
>>>That's why my heart beats faster when I hear the name MySQL - I'm sorry,
>>>I'll try to suppress these sidekicks.
>>>
>>>
>>The reason I react to statements against my favourite database, MySQL,
>>is simply due to the things I have done with it and things I have seen
>>others do with it.
>>
>>
>
>There are lot servers in the world with critical data for companies on
>cheap x86 hardware with Windows <what ever>. So I can conclude that
>this combination is good for that usage? I don't think so.
>But this is really off topic at the moment :)
>
>
We could easily make it more off-topic, but how to do so would be
off-topic...

>
>
>>Today, MySQL does have transactions. foreign key constraints and
>>subselect and stored procedure functionality are in beta.
>>
>>
>
>What about trigger?
>
>
If I recall correctly, triggers are in the 5.1 timeframe. Makes sense,
considering you want solid stored procedures to use in your triggers.
The possibilty of having pluggable stored procedure modules is a nice
though though - the number of times I have had clients look at me in a
state of shock when I tell them how much a Windows 2003 Server + SQL
Server setup is going to cost them for whatever application they need it
for.

>
>
>>MySQL's case insensitivity is a result of having to compensate for
>>Windows' terrible deficencies in some areas and a design choice in
>>others (which is actually a pretty stupid design choice if you ask me -
>>case-sensitive string comparisons are always faster).
>>
>>
>
>Why the Windows part? MS SQL is case sensitive.
>
>
MySQL stores table definitions as seperate files. Due to the Win32 API
not being fully case-aware, table names on Windows are always
case-insensitive. I believe there is a MySQL option to alter case
sensitivity though.

>
>
>>Regarding the comparisons you speak of, could you point me to some of
>>them? I'm yet to see a recent, decent comparison between the two. I'd
>>love to see PostgreSQL 7.4.1 vs MySQL 4.0.17 / 4.1.1 on Linux 2.6.x or
>>FreeBSD 5.2.x .
>>
>>
>
>If you point your newsreader to gmane.org and search in the group
>gmane.comp.db.postgresql.general and
>gmane.comp.db.postgresql.performance for topics with mysql you'll find
>lot of interesting threads.
>
>The problem is that you can't really compare Postgres with MySQL. You
>could compare Postgres and Oracle for example, but MySQL has lot of
>features missing so it has to be faster.
>
>
That's also a bit of a generalisation. For instance, MS SQL Server has a
lot of features missing compared to DB2 and Oracle. Yet, there are
plenty of benchmarks showing DB2 and Oracle dominating SQL Server and
qutie a few that go the other way. The correlation between feature set
and level of performance is erratic at best (where Pearson's r is about 0).

>Comparing Postgres with a current Beta (of MySQL 5?) would be on the one
>hand fair because they have similar features, on the other hand it's not
>fair because Postgres has these features for years, in MySQL they are
>brandnew and beta.
>
>
True, but MySQL has always put performance first. It is rare that a
production release is significantly faster than the first alpha release
in that series (for instance, 4.0.1 vs 4.0.18).

>So people tend to compare the same application with both Postgres and
>MySQL. Not fair again because quite a lot of intelligence is done in the
>application you would normally let the database do if you didn't have to
>take care of MySQL.
>
>
This can be quite fair, as many applications are built to the
lowest-common-denominator, which would be FoxPro. :-)

In all seriousness though, if you have an application that does things
in the application regardless of database used, the comparison is fair.

>I would recommend to use MySQL for applications where you have 100 read
>operations and one write, there MySQL is unbeaten fast. For critical
>data I would never recommend MySQL. At the moment, this might change in
>future.
>
>
MySQL + InnoDB is one blazing combination. InnoDB is multiversioned,
just like PostgreSQL. InnoDB also has the advantage of being one of the
youngest mutli-versioning implementations around. The fact that phantom
rows can not appear at any isolation level allows for higher performance
in some applications through not needing to go to SERIALIZABLE isolation.

Additionally, InnoDB Hot Backup is an excellent backup solution for
maintaining performance during a hot backup, not placing any additional
strain on the database than that caused by the users logged in. pg_dump
acts as a normal client app.

>BTW: I wouldn't always recommend Postgres for that job too. Some
>commercial database can have several advantages - so always use the best
>database for the job to do :-)
>
>
>
Indeed! The only thing you need to remember is that the best database is
rarely a Microsoft product - Access is slow, FoxPro's data storage is a
sick joke and MS SQL Server does some very disturbing things to maintain
the level of performance advertised.

>>I find your comment regarding the dual licence very interesting. If
>>you'd like to continue this (hopefully friendly) discussion, please
>>write to me off the list. :-)
>>
>>
>
>I always try to be friendly :-), but I don't see anything to discuss.
>The license says everything.
>
>
The way I see it, it makes sense. Either you give back to the open
source community through software or you ensure that open source
software continues to develop through funding some of that development.
Call it karma if you will.

>
>
>>>>>BEGIN;
>>>>>.. whatever dbmail does...
>>>>>COMMIT;
>>>>>
>>>>>
>[..]
>
>
>>>As a quick speed up that's a good solution yes. But if it's the right
>>>permanent solution?
>>>
>>>
>>Personally, I think it is. MyISAM in MySQL is a great table type for lots
>>and lots of INSERT statements OR lots and lots
>>of SELECT statements but not both (quoting Jeremy Zawdony from Yahoo!).
>>Anyone serious about performance in a case like DBMail would probably
>>use InnoDB anyway. The fact that it is now part of the standard MySQL
>>distro gives little reason to not use MyISAM tables except for their
>>ability to perform FULLTEXT searches.
>>
>>
>
>Sorry is this a typo or did you really mean MyISAM? So you wouldn't use
>transactions but let MySQL simply ignore the transaction commands?
>
>
>
>
Yes! If someone has an attachment to the MyISAM table type (for the
HANDLER interface, FULLTEXT indexes, OpenGIS-types in 4.1.0 and higher)
then they are fully aware of things that can happen in applications that
rely on the ROLLBACK statement.

Regards,

Chris

Re: some speed tests [ In reply to ]

xndbmail at xerus

Feb 15, 2004, 8:21 PM

Post #30 of 30 (1338 views)

Permalink

On Sun, Feb 15, 2004 at 09:30:58PM +1100, Chris Nolan wrote:
> Now you've peaked my curiosity!
>
> Do you have any additional details regarding table types used and
> whether or not the same thing occurs in the 4.0.x series?

There was a bug in 4.0.x where x < 15 that caused table corruption in
MyISAM tables.
http://www.vbulletin.com/forum/showthread.php?threadid=69803

xn