Mailing List Archive

Optimizing large story lookups
I'm seeing unusually long running / slow queries in a relatively straightforward template:

my @bio_stories = ref($story)->list({
element_key_name => 'bio_story',
});
# Filter for just those that have the 'national' element checked
my @national_bio_stories = grep { $_->get_data('national') } @bio_stories;
# Sort by last name
@national_bio_stories = sort { lc($a->get_data('last')) cmp lc($b->get_data('last')) } @national_bio_stories;

This seems to create a query that runs forever to lookup those ~1500 stories:

SELECT s.id, s.uuid, s.priority, s.source__id, s.usr__id, s.element_type__id, s.first_publish_date, s.publish_date, s.expire_date, s.current_version, s.published_version, s.workflow__id, s.publish_status, s.primary_uri, s.active, s.desk__id, s.site__id, s.alias_id, i.id, i.name, i.description, i.story__id, i.version, i.usr__id, i.primary_oc__id, i.slug, i.cover_date, i.note, i.checked_out, group_concat( DISTINCT m.grp__id ), group_concat( DISTINCT c.asset_grp_id ), group_concat( DISTINCT w.asset_grp_id )
FROM story_instance i, element_type e, story__contributor sic, story_member sm, member m, story__category sc, category c, workflow w, story s
WHERE s.id = i.story__id AND sm.object_id = s.id AND m.id = sm.member__id AND m.active = '1' AND sc.story_instance__id = i.id AND c.id = sc.category__id AND s.workflow__id = w.id AND s.current_version = i.version AND i.checked_out = ( SELECT checked_out FROM story_instance WHERE version = i.version AND story__id = i.story__id ORDER BY checked_out

I'm also getting a syntax error when I try to run this through ANALYZE EXPLAIN:

ERROR: syntax error at end of input
LINE 3: ...= i.version AND story__id = i.story__id ORDER BY checked_out

Any thoughts on the above greatly appreciated.

Phillip.

--
Phillip Smith
http://phillipadsmith.com
Re: Optimizing large story lookups [ In reply to ]
That is some impressive SQL! There is an unclosed subquery right on the end there:

...snip... AND i.checked_out = ( SELECT checked_out FROM story_instance WHERE version = i.version AND story__id = i.story__id ORDER BY checked_out

There should be a closing parens at the end to finish the subquery. That could be the end of it, or maybe there’s even more to the WHERE clause and it’s getting cut off somewhere?
Re: Optimizing large story lookups [ In reply to ]
> On Jul 10, 2014, at 12:46 PM, Greg Heo <greg@gregheo.com> wrote:
>
>
> There should be a closing parens at the end to finish the subquery. That could be the end of it, or maybe there’s even more to the WHERE clause and it’s getting cut off somewhere?

Doh. Thanks Greg. Will hunt around and report back. :)



Sent from a phone. Please excuse the brevity and typos
Re: Optimizing large story lookups [ In reply to ]
On 2014-07-10, at 12:46 PM, Greg Heo <greg@gregheo.com> wrote:

> That is some impressive SQL! There is an unclosed subquery right on the end there:
>
> ...snip... AND i.checked_out = ( SELECT checked_out FROM story_instance WHERE version = i.version AND story__id = i.story__id ORDER BY checked_out
>
> There should be a closing parens at the end to finish the subquery. That could be the end of it, or maybe there’s even more to the WHERE clause and it’s getting cut off somewhere?

Is there any reason that output from `SELECT current_query FROM pg_stat_activity ;` would truncate the query if it was over a certain length?

I'm stumped here.

Phillip.


--
Phillip Smith
http://phillipadsmith.com
Re: Optimizing large story lookups [ In reply to ]
Hi Phillip,

> Is there any reason that output from `SELECT current_query FROM
> pg_stat_activity ;` would truncate the query if it was over a certain
> length?

It's default behavior. I think you can change this via
track_activity_query_size:

http://www.postgresql.org/docs/8.4/static/runtime-config-statistics.html

Cheers,

Alex
Re: Optimizing large story lookups [ In reply to ]
On 2014-07-11, at 4:02 PM, Phillip Smith <ps@phillipadsmith.com> wrote:

>>
>> On Jul 10, 2014, at 12:46 PM, Greg Heo <greg@gregheo.com> wrote:
>>
>>
>> There should be a closing parens at the end to finish the subquery. That could be the end of it, or maybe there’s even more to the WHERE clause and it’s getting cut off somewhere?
>
> Doh. Thanks Greg. Will hunt around and report back. :)

Okey-dokey! Here we go:

http://explain.depesz.com/s/fjWh

And the full query:

https://gist.github.com/phillipadsmith/ffe86ae125bf94910654

Any help / pointers appreciated.

Phillip.

--
Phillip Smith
http://phillipadsmith.com
http://twitter.com/phillipadsmith
http://linkedin.com/in/phillipadsmith

Toronto: (647) 361-8248
Toll-free: (888) 418-1868

PGP: 25F7 F458 C261 B25B 5297 8F0A 518B F15D D63C 5D49

If your email inbox is out of control, check out http://sanebox.com/t/s0q7m

Save our in-boxes! http://emailcharter.org
Re: Optimizing large story lookups [ In reply to ]
On Jul 14, 2014, at 10:33 AM, Phillip Smith <ps@phillipadsmith.com> wrote:

> Okey-dokey! Here we go:
>
> http://explain.depesz.com/s/fjWh
>
> And the full query:
>
> https://gist.github.com/phillipadsmith/ffe86ae125bf94910654
>
> Any help / pointers appreciated.

Do you have a lot of story instances?

Best,

David
Re: Optimizing large story lookups [ In reply to ]
On 2014-07-14, at 5:22 PM, "David E. Wheeler" <david@justatheory.com> wrote:

> On Jul 14, 2014, at 10:33 AM, Phillip Smith <ps@phillipadsmith.com> wrote:
>
>> Okey-dokey! Here we go:
>>
>> http://explain.depesz.com/s/fjWh
>>
>> And the full query:
>>
>> https://gist.github.com/phillipadsmith/ffe86ae125bf94910654
>>
>> Any help / pointers appreciated.
>
> Do you have a lot of story instances?

Yes, for this one, it would be about 1500 stories that are returned.

It's basically a lookup for all "contributor" stories on a site, to be used for an A-Z listing of people who contribute.

I've been brainstorming other approaches, but haven't come up with anything yet…

Phillip.

--
Phillip Smith
http://phillipadsmith.com
Re: Optimizing large story lookups [ In reply to ]
On Jul 14, 2014, at 2:30 PM, Phillip Smith <ps@phillipadsmith.com> wrote:

> Yes, for this one, it would be about 1500 stories that are returned.
>
> It's basically a lookup for all "contributor" stories on a site, to be used for an A-Z listing of people who contribute.
>
> I've been brainstorming other approaches, but haven't come up with anything yet…

You could write your own query, but it would be tough.

David
Re: Optimizing large story lookups [ In reply to ]
On 2014-07-14, at 5:39 PM, "David E. Wheeler" <david@justatheory.com> wrote:

> On Jul 14, 2014, at 2:30 PM, Phillip Smith <ps@phillipadsmith.com> wrote:
>
>> Yes, for this one, it would be about 1500 stories that are returned.
>>
>> It's basically a lookup for all "contributor" stories on a site, to be used for an A-Z listing of people who contribute.
>>
>> I've been brainstorming other approaches, but haven't come up with anything yet…
>
> You could write your own query, but it would be tough.

Well, the thing is, at the moment, it doesn't appear to complete. It can run for more than 10 minutes without finishing… something's odd about that, no?

Phillip.


--
Phillip Smith
http://phillipadsmith.com
Re: Optimizing large story lookups [ In reply to ]
On Jul 14, 2014, at 2:42 PM, Phillip Smith <ps@phillipadsmith.com> wrote:

>> You could write your own query, but it would be tough.
>
> Well, the thing is, at the moment, it doesn't appear to complete. It can run for more than 10 minutes without finishing… something's odd about that, no?

Oh, how’d you get the plan?

Post it to #postgresql on Freenode.

David