Mailing List Archive: [The Trac Project] #3837: Inefficient sql query

[The Trac Project] #3837: Inefficient sql query

Oct 1, 2006, 2:27 PM

Post #1 of 5 (3204 views)

#3837: Inefficient sql query
--------------------------------+-------------------------------------------
Reporter: andres@anarazel.de | Owner: jonas
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: general | Version: 0.10
Severity: normal | Keywords:
--------------------------------+-------------------------------------------
The sql query in get_youngest_rev_in_cache is quite inefficient and does a
full table scan in the revision table which scales quite bad.
The test repository used for the following numbers is the repository from
crystalspace3d.org which has around 25000 revisions and is 1.5gb big.

I benchmarked with apachebench with 3 times 20 requests. Everything one
with a concurrency of 4 and without. As the concurrency didnt make any
significant difference and as they vary much more over each series of
requests, i didnt wrote the numbers down.

What i requested was the timeline with a history depth of 5 days.
Hardware is a pentium 4 system, with a raid 1 disk system and 1 gig of
ram.

The following numbers are for sqlite. With postgres its very much more
visible, but i thought that sqlite is more wildly used.

Original query:
1.01 requests/s

SELECT MAX(CAST(rev as int)) FROM revision
1.2 requests/s

SELECT MAX(rev_int) FROM revision
with rev_int beeing a column which copied rev as an int and an seperate
index over it. The problem is, that sqlite doesnt seems to support indices
over expressions which eg postgres does.
3.66 requests/s

The effect is also visible when using a history depth of 90 days but the
differences arent _that_ big as before. I didnt really measure those as
its sometimes below 0.11 requests/s ...

The problem with this is, that the CAST() syntax is only supported with
sqlite3 and not 2. Although not tested mysql seems to support that for
some time.

And indices over expressions are not supported in sqlite at all. Which is
a pity because else you could just add an index like CREATE INDEX
revision_rev_asint ON revision ((CAST(rev as int)).
The only even halfway practical solution i currently see is an extra
column + trigger. But thats very hackish.
Any other idea?

--
Ticket URL: <http://trac.edgewall.org/ticket/3837>
The Trac Project <http://trac.edgewall.org/>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Trac Tickets" group.
To post to this group, send email to trac-tickets@googlegroups.com
To unsubscribe from this group, send email to trac-tickets-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.se/group/trac-tickets
-~----------~----~----~----~------~----~------~--~---

Re: [The Trac Project] #3837: Inefficient sql query [ In reply to ]

noreply at edgewall

Oct 1, 2006, 2:29 PM

Post #2 of 5 (3097 views)

Permalink

Re: [The Trac Project] #3837: Inefficient sql query [ In reply to ]

noreply at edgewall

Oct 1, 2006, 2:50 PM

Post #3 of 5 (3110 views)

Permalink

Re: [The Trac Project] #3837: Inefficient sql query [ In reply to ]

noreply at edgewall

Oct 1, 2006, 7:32 PM

Post #4 of 5 (3104 views)

Permalink

Re: [The Trac Project] #3837: Inefficient sql query [ In reply to ]

noreply at edgewall

Oct 2, 2006, 12:24 AM

Post #5 of 5 (3077 views)

Permalink

#3837: Inefficient sql query
--------------------------------+-------------------------------------------
Reporter: andres@anarazel.de | Owner: cboos
Type: enhancement | Status: new
Priority: normal | Milestone: 0.10.1
Component: version control | Version: 0.10
Severity: normal | Resolution:
Keywords: svn |
--------------------------------+-------------------------------------------
Comment (by cboos):

Well, for 0.10.1, I wanted to put at least the `CAST`, as this is cleaner
than the current hack, and apparently a bit faster.
I also wanted to check if there's really no way with SQLite to create an
index based on a computed value. It looks like we could do it with user-
defined collating sequences.

The first change wouldn't affect the DB at all, the second would create a
new index if `svn` is the `repository_type`. Ideally, this should ''not''
be done by a DB upgrade, since you can decide to set for the first time or
change the `repository_type` at anytime. This should rather be done just
after a `resync`, by the version control specific code. So probably this
is better seen as a TODO item for the VcRefactoring...

About making a DB schema in 0.10.1, we already discussed one: the current
`PRIMARY KEY` constraint on the `node_change` table is problematic for
MySQL (see #3676 and attachment:ticket:3778:node_change_upgrade-
r3787.patch).

--
Ticket URL: <http://trac.edgewall.org/ticket/3837#comment:4>
The Trac Project <http://trac.edgewall.org/>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Trac Tickets" group.
To post to this group, send email to trac-tickets@googlegroups.com
To unsubscribe from this group, send email to trac-tickets-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.se/group/trac-tickets
-~----------~----~----~----~------~----~------~--~---