Mailing List Archive

[OSSTEST PATCH v2 08/41] sg-report-flight: Ask the db for flights of interest
Specifically, we narrow the initial query to flights which have at
least some job with the built_revision_foo we are looking for.

This condition is strictly broader than that implemented inside the
flight search loop, so there is no functional change.

Perf: runtime of my test case now ~300s-500s.

Example query before (from the Perl DBI trace):

SELECT * FROM (
SELECT flight, blessing FROM flights
WHERE (branch='xen-unstable')
AND EXISTS (SELECT 1
FROM jobs
WHERE jobs.flight = flights.flight
AND jobs.job = ?)

AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
ORDER BY flight DESC
LIMIT 1000
) AS sub
ORDER BY blessing ASC, flight DESC

With these bind variables:

"test-armhf-armhf-libvirt"

After:

SELECT * FROM (
SELECT DISTINCT flight, blessing
FROM flights
JOIN runvars r1 USING (flight)

WHERE (branch='xen-unstable')
AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
AND EXISTS (SELECT 1
FROM jobs
WHERE jobs.flight = flights.flight
AND jobs.job = ?)

AND r1.name LIKE 'built\_revision\_%'
AND r1.name = ?
AND r1.val= ?

ORDER BY flight DESC
LIMIT 1000
) AS sub
ORDER BY blessing ASC, flight DESC

With these bind variables:

"test-armhf-armhf-libvirt"
'built_revision_xen'
'165f3afbfc3db70fcfdccad07085cde0a03c858b'

Diff to the query:

SELECT * FROM (
- SELECT flight, blessing FROM flights
+ SELECT DISTINCT flight, blessing
+ FROM flights
+ JOIN runvars r1 USING (flight)
+
WHERE (branch='xen-unstable')
+ AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
AND EXISTS (SELECT 1
FROM jobs
WHERE jobs.flight = flights.flight
AND jobs.job = ?)

- AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
+ AND r1.name LIKE 'built\_revision\_%'
+ AND r1.name = ?
+ AND r1.val= ?
+
ORDER BY flight DESC
LIMIT 1000
) AS sub

CC: George Dunlap <George.Dunlap@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v2: Use proper \ escaping for underscores in LIKE
---
schema/runvars-built-index.sql | 2 +-
sg-report-flight | 64 ++++++++++++++++++++++++++++++++--
2 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/schema/runvars-built-index.sql b/schema/runvars-built-index.sql
index 7108e0af..128e69e9 100644
--- a/schema/runvars-built-index.sql
+++ b/schema/runvars-built-index.sql
@@ -1,4 +1,4 @@
--- ##OSSTEST## 007 Preparatory
+-- ##OSSTEST## 007 Needed
--
-- This index helps sg-report-flight find relevant flights.

diff --git a/sg-report-flight b/sg-report-flight
index 7f2790ce..10127582 100755
--- a/sg-report-flight
+++ b/sg-report-flight
@@ -185,19 +185,77 @@ END
if (defined $job) {
push @flightsq_params, $job;
$flightsq_jobcond = <<END;
- EXISTS (SELECT 1
+ AND EXISTS (SELECT 1
FROM jobs
WHERE jobs.flight = flights.flight
AND jobs.job = ?)
END
}

+ # We build a slightly complicated query to find possibly-relevant
+ # flights. A "possibly-relevant" flight is one which the main
+ # flight categorisation algorithm below (the loop over $tflight)
+ # *might* decide is of interest.
+ #
+ # That algorithm produces a table of which revision(s) of what
+ # %specver trees the build jobs for the relevant test job used.
+ # And then it insists (amongst other things) that for each such
+ # tree the revision in question appears.
+ #
+ # It only looks at build jobs within the flight. So any flight
+ # that the main algorithm finds interesting will have *some* job
+ # (in the same flight) mentioning that revision in a built
+ # revision runvar. So we can search the runvars table by its
+ # index on the revision.
+ #
+ # So we look for flights that have an appropriate entry in runvars
+ # for each %specver tree. We can do this by joining the runvar
+ # table once for each tree.
+ #
+ # The "osstest" tree is handled specially. as ever. (We use
+ # "r$ri" there too for orthogonality of the code, not because
+ # there could be multiple specifiations for the osstest revision.)
+ #
+ # This complex query is an optimisation: for correctness, we must
+ # still execute the full job-specific recursive examination, for
+ # each possibly-relevant flight - that's the $tflight loop body.
+
+ my $runvars_joins = '';
+ my $runvars_conds = '';
+ my $ri=0;
+ foreach my $tree (sort keys %{ $specver{$thisthat} }) {
+ $ri++;
+ if ($tree ne 'osstest') {
+ $runvars_joins .= <<END;
+ JOIN runvars r$ri USING (flight)
+END
+ $runvars_conds .= <<END;
+ AND r$ri.name LIKE 'built\_revision\_%'
+ AND r$ri.name = ?
+ AND r$ri.val= ?
+END
+ push @flightsq_params, "built_revision_$tree",
+ $specver{$thisthat}{$tree};
+ } else {
+ $runvars_joins .= <<END;
+ JOIN flights_harness_touched r$ri USING (flight)
+END
+ $runvars_conds .= <<END;
+ AND r$ri.harness= ?
+END
+ push @flightsq_params, $specver{$thisthat}{$tree};
+ }
+ }
+
my $flightsq= <<END;
SELECT * FROM (
- SELECT flight, blessing FROM flights
+ SELECT DISTINCT flight, blessing
+ FROM flights
+$runvars_joins
WHERE $branches_cond_q
- AND $flightsq_jobcond
AND $blessingscond
+$flightsq_jobcond
+$runvars_conds
ORDER BY flight DESC
LIMIT 1000
) AS sub
--
2.20.1
Re: [OSSTEST PATCH v2 08/41] sg-report-flight: Ask the db for flights of interest [ In reply to ]
> On Jul 31, 2020, at 12:37 PM, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
>
> Specifically, we narrow the initial query to flights which have at
> least some job with the built_revision_foo we are looking for.
>
> This condition is strictly broader than that implemented inside the
> flight search loop, so there is no functional change.

Assuming this is true, that job / runvar is filtered after extracting this information, then...

>
> Perf: runtime of my test case now ~300s-500s.
>
> Example query before (from the Perl DBI trace):
>
> SELECT * FROM (
> SELECT flight, blessing FROM flights
> WHERE (branch='xen-unstable')
> AND EXISTS (SELECT 1
> FROM jobs
> WHERE jobs.flight = flights.flight
> AND jobs.job = ?)
>
> AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
> ORDER BY flight DESC
> LIMIT 1000
> ) AS sub
> ORDER BY blessing ASC, flight DESC
>
> With these bind variables:
>
> "test-armhf-armhf-libvirt"
>
> After:
>
> SELECT * FROM (
> SELECT DISTINCT flight, blessing
> FROM flights
> JOIN runvars r1 USING (flight)
>
> WHERE (branch='xen-unstable')
> AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
> AND EXISTS (SELECT 1
> FROM jobs
> WHERE jobs.flight = flights.flight
> AND jobs.job = ?)
>
> AND r1.name LIKE 'built\_revision\_%'
> AND r1.name = ?
> AND r1.val= ?
>
> ORDER BY flight DESC
> LIMIT 1000
> ) AS sub
> ORDER BY blessing ASC, flight DESC

…I agree that this shoud introduce no other changes.

Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Re: [OSSTEST PATCH v2 08/41] sg-report-flight: Ask the db for flights of interest [ In reply to ]
George Dunlap writes ("Re: [OSSTEST PATCH v2 08/41] sg-report-flight: Ask the db for flights of interest"):
> > On Jul 31, 2020, at 12:37 PM, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
> > Specifically, we narrow the initial query to flights which have at
> > least some job with the built_revision_foo we are looking for.
> >
> > This condition is strictly broader than that implemented inside the
> > flight search loop, so there is no functional change.
>
> Assuming this is true, that job / runvar is filtered after extracting this information, then...
...
> …I agree that this shoud introduce no other changes.
>
> Reviewed-by: George Dunlap <george.dunlap@citrix.com>

Thanks.

Just to convince myself, I ran through the argument based on the perl
code. I found a lacuna.

1. The job of findaflight is to find a flight, and it doesn't have
significant side effects - just a return value.

2. If it returns a flight from the loop, $whynot must have been
undef. $whynot is never unset.

Consider some tree in %{ $specver{$thisthat} }.

3. If @revisions is 0 for that tree, $whynot is set. So one of the
two queries $revisionsq or $revisionsosstestq must have returned
some rows.

4. Furthermore, none of those rows must have passed the $wronginfo
grep. If they had, $whynot would have been set. Any row
whose val doesn't contain a colon, and which doesn't end up
in $wronginfo, had a val equal to the requested specver.

5. Colons in this field appear only in mercurial revisions. These are
now obsoelete - we have no mercurial trees. A consequence of this
commit is actually that we should explicitly abolish mercurial
support, at least pending a change to osstest to arrange for the
val column to contain only the hash part and not the number part.

6. Together, these conditons means that if $whynot wasn't set,
there must have been some row whose val matched the specver.

7. Both the $revisionsq and $revisionsosstestq queries take a flight
bound variable condition. This is bound by a value that came out
of @binfos. @binfos is made from %binfos, where the flight number
is the key. %binfos is populated by the @binfos_todo loop, where
it gets the flight number from a @binfos_todos entry - but it
filters them for $bflight == $tflight.

8. So some row must have matched the flight, and the specver, and
of course the name. This is precisely the new condition.

I think this means I should put a commit earlier in this series which
disables mercurial support until the colon version situation is
rationalised.

Ian.