Mailing List Archive

publish_another() with QUEUE_PUBLISH_JOBS enabled
Bricoleurs,

I'm having some issues with publish_another() in Bric 2.0.1. For
performance reasons, I need to keep QUEUE_PUBLISH_JOBS enabled and use
bric_queued to handle publishing. Unfortunately, it appears most of the
updates to publish_another() between Bric 1.10.x and 2.x only apply if
QUEUE_PUBLISH_JOBS is disabled. Perhaps I don't fully understand it
yet, but it looks like the 'another_queue' process in publish_another()
doesn't quite work when the job(s) in it are not executed immediately
(i.e. bric_queued doesn't have enough information about the burner state
to keep the publish_another() jobs "together").

Here's my problem. My autohandler contains a cleanup block which is
used to trigger a republish of related assets:

> <%cleanup>
> # Make it always return true to disable the triggered publishes.
> return if $burner->get_mode != PUBLISH_MODE || $burner->notes('in_another');
> $burner->notes(in_another => 1);
> $burner->publish_another($_) for ref($story)->list({
>     related_story_id  => $story->get_id,
>     exclude_id        => $story->get_id,
>     active            => 1,
>     unexpired         => 1,
>     published_version => 1,
> });
> $burner->notes(in_another => 0);
> </%cleanup>

If QUEUE_PUBLISH_JOBS is enabled, this creates an infinite publish loop
when you publish story A which relates to story B, and story B happens
to also have a relation back to story A.

It seems to me this can be fixed with one of the following methods:

1. Implement notes between publish jobs. It looks like this would
have to be added to the 'jobs' table in order for bric_queued to
pick them up, since you can't really pass a burner object.
2. Add an alternate code path to publish_another() in Burner.pm
which, instead of adding each job to 'another_queue' for later
publication, simply calls publish() directly on the asset if
QUEUE_PUBLISH_JOBS is enabled. See the attached patch file for
how I implemented this in my testing environment; a sketch of the
idea follows below.
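For the record, the gist of option 2 boils down to something like the
self-contained toy below. The real patch lives in Burner.pm and differs
in detail; the constants and data structures here are stand-ins.

#!/usr/bin/env perl
use strict;
use warnings;

# Toy model of option 2. With QUEUE_PUBLISH_JOBS on, publish the
# related asset immediately on the *same* burner, so notes such as
# 'in_another' stay visible and the circular republish stops.
use constant QUEUE_PUBLISH_JOBS => 1;

my %notes;          # stand-in for $burner->notes
my @another_queue;  # stand-in for the 2.0 deferred queue

sub publish {
    my ($asset) = @_;
    print "publishing $asset->{id}\n";   # the burn itself always happens
    # The cleanup-block guard: don't trigger further publishes when we
    # are already inside a publish_another cascade.
    return if $notes{in_another};
    local $notes{in_another} = 1;
    publish_another($_) for @{ $asset->{related} };
}

sub publish_another {
    my ($asset) = @_;
    if (QUEUE_PUBLISH_JOBS) {
        publish($asset);                 # immediate: notes stay intact
    }
    else {
        push @another_queue, $asset;     # stock 2.0: deferred, separate burn
    }
}

# Story A and Story B relate to each other; only two publishes happen.
my ($a, $b) = ({ id => 'A' }, { id => 'B' });
$a->{related} = [$b];
$b->{related} = [$a];
publish($a);   # prints "publishing A" then "publishing B", then stops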

Thoughts? Is either of these potential fixes valuable, or can I do
something in my autohandler to work around this?

-Nick
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
On Feb 25, 2011, at 7:54 AM, Nick Legg wrote:

> If QUEUE_PUBLISH_JOBS is enabled, this creates an infinite publish loop when you publish story A which relates to story B, and story B happens to also have a relation back to story A.
>
> It seems to me this can be fixed with one of the following methods:
>
> 1. Implement notes between publish jobs. It looks like this would
> have to be added to the 'jobs' table in order for bric_queued to
> pick them up, since you can't really pass a burner object.

Oh, ick.

> 2. Add an alternate code path to publish_another() in Burner.pm
> which, instead of adding each job to 'another_queue' for later
> publication, simply calls publish() directly on the asset if
> QUEUE_PUBLISH_JOBS is enabled. See the attached patch file for
> how I implemented this in my testing environment.

Huh. Doesn't that bring the performance slowdown back?

Best,

David
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
On 2/25/2011 11:49 AM, David E. Wheeler wrote:
> On Feb 25, 2011, at 7:54 AM, Nick Legg wrote:
>
>> If QUEUE_PUBLISH_JOBS is enabled, this creates an infinite publish loop when you publish story A which relates to story B, and story B happens to also have a relation back to story A.
>>
>> It seems to me this can be fixed with one of the following methods:
>>
>> 1. Implement notes between publish jobs. It looks like this would
>> have to be added to the 'jobs' table in order for bric_queued to
>> pick them up, since you can't really pass a burner object.
> Oh, ick.
>
Not sure who wrote this, but see the following comment in
Bric::Util::Burner->flush_another_queue:

> # XXX We're passing notes here, and Job::Pub will add them to the burner it
> # creates, but it does not store them. So if the publish job is in the future
> # or if QUEUE_PUBLISH_JOBS is enabled, the notes will not be available to the
> # other burner. We might want to add note serialization and deserialization to
> # the job class if this becomes a serious problem for anyone.

I'm not sure how this would be done without storing stuff in the DB?
>> 2. Add an alternate code path to publish_another() in Burner.pm
>> which, instead of adding each job to 'another_queue' for later
>> publication, simply calls publish() directly on the asset if
>> QUEUE_PUBLISH_JOBS is enabled. See the attached patch file for
>> how I implemented this in my testing environment.
> Huh. Doesn't that bring the performance slowdown back?
Not that I can tell so far. Weren't our previous performance issues
rooted in Net::SSH2? This quick patch I hacked in is intended to "give
back" the feature we have in Bric 1.10.9 where publish_another()
directly calls publish() on the related asset *and* has control over the
burner object. Unfortunately, this very obviously subverts the nice
de-duplication code provided in Bric 2 - it seems this was designed
without bric_queued in mind? I'll kick off a bulk republish on the
(patched) testing environment to see what happens.
> Best,
>
> David

--
Nick Legg
Information Technology Services
Web Developer
Denison University
740-587-6537
leggn@denison.edu
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
On Feb 25, 2011, at 9:01 AM, Nick Legg wrote:

> Not sure who wrote this, but see the following comment in Bric::Util::Burner->flush_another_queue:
>
> > # XXX We're passing notes here, and Job::Pub will add them to the burner it
> > # creates, but it does not store them. So if the publish job is in the future
> > # or if QUEUE_PUBLISH_JOBS is enabled, the notes will not be available to the
> > # other burner. We might want to add note serialization and deserialization to
> > # the job class if this becomes a serious problem for anyone.

Sounds like me.

> I'm not sure how this would be done without storing stuff in the DB?

It couldn't. It probably wouldn't be hard.
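Something along these lines, say. This is hypothetical: there is no
notes column on the job table today, and these helper subs don't exist
in Bric.

use strict;
use warnings;
use DBI;
use JSON;   # encode_json/decode_json

# Assumes a new TEXT column "notes" on the job table.
sub save_job_notes {
    my ($dbh, $job_id, $notes) = @_;     # $notes is a hashref
    $dbh->do('UPDATE job SET notes = ? WHERE id = ?',
             undef, encode_json($notes), $job_id);
}

sub load_job_notes {
    my ($dbh, $job_id) = @_;
    my ($json) = $dbh->selectrow_array(
        'SELECT notes FROM job WHERE id = ?', undef, $job_id);
    return $json ? decode_json($json) : {};   # hand back to the new burner
}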

> Not that I can tell so far. Weren't our previous performance issues rooted in Net::SSH2? This quick patch I hacked in is intended to "give back" the feature we have in Bric 1.10.9 where publish_another() directly calls publish() on the related asset *and* has control over the burner object. Unfortunately, this very obviously subverts the nice de-duplication code provided in Bric 2 - it seems this was designed without bric_queued in mind? I'll kick off a bulk republish on the (patched) testing environment to see what happens.

Great, let us know.

David
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
On Feb 25, 2011, at 10:54 AM, Nick Legg wrote:

> Unfortunately, it appears most of the updates to publish_another() between Bric 1.10.x and 2.x only apply if QUEUE_PUBLISH_JOBS is disabled.

I do not think this is the case. I have been using 2.0 with publish_another and bric_queued happily for months. Rather, I think publish_another does not harmonize across multiple burns.

A simple publish_another will not kick off a second burn instance, but looking at your code, I believe that is what is happening here. So you wind up spawning a burn which does publish_another, which spawns another burn, which spawns publish_another, and so on.

I remember when Faith and I wrote this code to pass things between burn instances, but without the fuller template context, I'm going to struggle to remember why we did it in that way, other than to prevent mass publish_another on preview. If that is the only reason, I believe you could more elegantly code it to get rid of the whole note passing business, as opposed to what you've done by rolling back the 2.0 burn changes.

I'll also add that I've recently tended to do stuff like this in the story template as opposed to the autohandler. YMMV.

> Huh. Doesn't that bring the performance slowdown back?
> Not that I can tell so far. Weren't our previous performance issues rooted in Net::SSH2?

Well, yes and no. If you don't turn on QUEUE_PUBLISH_JOBS, then the Bric process serving the user interface will happily wait around for a publish job to finish running before letting the user do anything else, providing the illusion of slowness. As you can imagine, this is evil for large publishing jobs. bric_queued should probably run by default, be integrated into the start and stop scripts, and be configured in bricolage.conf.
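(For reference, the directive in question looks something like the
following; double-check the boolean spelling against your own
bricolage.conf:)

# bricolage.conf: hand publish jobs to bric_queued instead of making
# the UI process wait out the burn
QUEUE_PUBLISH_JOBS = 1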

-Matt
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
On 2/26/2011 12:45 PM, Matthew Rolf wrote:
> On Feb 25, 2011, at 10:54 AM, Nick Legg wrote:
>> Unfortunately, it appears most of the updates to publish_another() between Bric 1.10.x and 2.x only apply if QUEUE_PUBLISH_JOBS is disabled.
> I do not think this is the case. I have been using 2.0 with publish_another and bric_queued happily for months. Rather, I think publish_another does not harmonize across multiple burns.
I don't yet understand how that is possible. As far as I can tell, the
process for publishing Story A (where Story A relates to Story B and
Story B relates back to Story A) is this:

1. User publishes Story A, which creates a Publish job in the database
2. bric_queued picks up the Publish job for Story A and executes it
3. The burner, while working on Story A, runs publish_another() in
the template and adds Story B to the another_queue
4. bric_queued flushes the another_queue, which adds a Publish job
for Story B to the database
* note: another_queue is now empty
5. bric_queued picks up the Publish job for Story B and executes it
6. The burner, while working on Story B, runs publish_another() in
the template and adds Story A to the another_queue
7. bric_queued flushes the another_queue, which adds a Publish job
for Story A to the database
8. Go back to step 2

That process seems fishy. Can someone point out what I am missing?
> A simple publish_another will not kick off a second burn instance, but looking at your code, I believe that is what is happening here. So you wind up spawning a burn which does publish_another, which spawns another burn, which spawns publish_another, and so on.
It seems that publish_another merely adds a Business Asset to the
another_queue. It then becomes bric_queued's job to flush the
another_queue (this creates a Publish job in the database), then later
pick up the Publish job from the database and execute it. That *must*
be a separate burn instance, unless I'm missing a step somewhere?

The hack I added to Burner.pm *prevents* this 'loop' behavior (if you
properly use the 'notes' field) by forcing an immediate publication of
the 2nd story.
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
On Feb 26, 2011, at 9:45 AM, Matthew Rolf wrote:

> A simple publish_another will not kick off a second burn instance, but looking at your code, I believe that is what is happening here. So you wind up spawning a burn which does publish_another, which spawns another burn, which spawns publish_another, and so on.

Yes, that's the difference between publish_another in 1.10 vs. 2.0. 2.0 always schedules a publish job, which, when executed, will create a new burner object.

> I remember when Faith and I wrote this code to pass things between burn instances, but without the fuller template context, I'm going to struggle to remember why we did it in that way, other than to prevent mass publish_another on preview. If that is the only reason, I believe you could more elegantly code it to get rid of the whole note passing business, as opposed to what you've done by rolling back the 2.0 burn changes.

That would be the best solution for this case.

> I'll also add that I've recently tended to do stuff like this in the story template as opposed to the autohandler. YMMV.

I often call publish_another in the autohandler, but that shouldn't matter to this discussion.

>> Huh. Doesn't that bring the performance slowdown back?
>> Not that I can tell so far. Weren't our previous performance issues rooted in Net::SSH2?
>
> Well, yes and no. If you don't turn on QUEUE_PUBLISH_JOBS, then the Bric process serving the user interface will happily wait around for a publish job to finish running before letting the user do anything else, providing the illusion of slowness. As you can imagine, this is evil for large publishing jobs. bric_queued should probably run by default, be integrated into the start and stop scripts, and be configured in bricolage.conf.

I believe it is for most organizations. Perhaps we should consider making that the default…

David
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
On Feb 28, 2011, at 7:53 AM, Nick Legg wrote:

> I don't yet understand how that is possible. As far as I can tell, the process for publishing Story A (where Story A relates to Story B and Story B relates back to Story A) is this:
>
> 1. User publishes Story A, which creates a Publish job in the database
> 2. bric_queued picks up the Publish job for Story A and executes it
> 3. The burner, while working on Story A, runs publish_another() in
> the template and adds Story B to the another_queue
> 4. bric_queued flushes the another_queue, which adds a Publish job
> for Story B to the database
> * note: another_queue is now empty
> 5. bric_queued picks up the Publish job for Story B and executes it
> 6. The burner, while working on Story B, runs publish_another() in
> the template and adds Story A to the another_queue
> 7. bric_queued flushes the another_queue, which adds a Publish job
> for Story A to the database
> 8. Go back to step 2
>
> That process seems fishy. Can someone point out what I am missing?

You got it exactly right. The problem is how to deal with circular publishes. You had been using notes to avoid that, but the queued nature of publish_another() in 2.0 disallows that workaround. Of course, it was always a workaround. There needs to be a better fix, and I don't think that propagating notes is that fix.

> It seems that publish_another merely adds a Business Asset to the another_queue. It then becomes bric_queued's job to flush the another_queue (this creates a Publish job in the database), then later pick up the Publish job from the database and execute it. That *must* be a separate burn instance, unless I'm missing a step somewhere?

Nope, that's exactly right.

> The hack I added to Burner.pm *prevents* this 'loop' behavior (if you properly use the 'notes' field) by forcing an immediate publication of the 2nd story.

And that's a fine intermediate workaround, but there needs to be a better solution.

I think the use of circular relationships is questionable, but I know that it will come up, even if you don't design for it, because users might do it without realizing. Often the best way to deal with it is to call publish_another() from only one side of that circular pattern, but which side is which is not always easy to recognize in templates.

One way to do it is to query recent jobs and not republish a story if it was published within, say, the last 5 minutes. That's not very satisfying, though.
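In template terms, that check might look roughly like this. Treat it as
a sketch: the '%s' epoch format for get_publish_date() is an
assumption, and the list() parameters are just the ones from Nick's
cleanup block.

my $window = 5 * 60;   # skip anything published in the last 5 minutes
for my $rel ( ref($story)->list({
    related_story_id  => $story->get_id,
    exclude_id        => $story->get_id,
    published_version => 1,
}) ) {
    my $last = $rel->get_publish_date('%s');   # epoch seconds, if '%s' works
    next if defined $last && time - $last < $window;
    $burner->publish_another($rel);
}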

I think that, longer term, the way to go is to have some way to group publishes. So that if you publish a story, anything passed to publish_another() is added to the group. This is essentially how another_queue works, but because the groups would be stored, they'd persist for the duration of the publish and all jobs spawned by publish_another. That is, publish_another() would become less transient. So if you published story A, and it calls publish_another(story b), and story b calls publish_another(story a), story a would not be published then, because it was part of the same publish group as story b.
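In toy code, the idea is something like this. It's entirely
hypothetical; a real implementation would persist the group in a table
so it survives across the separate burner processes bric_queued spawns.

use strict;
use warnings;

my %publish_group;   # stand-in for a persisted table: group_id => { asset_id => 1 }

sub publish_another {
    my ($group_id, $asset_id) = @_;
    if ($publish_group{$group_id}{$asset_id}++) {
        print "skip $asset_id: already in publish group $group_id\n";
        return;
    }
    print "schedule publish job for $asset_id (group $group_id)\n";
    # the scheduled job would carry $group_id, so the burner that picks
    # it up consults the same group
}

# Story A relates to B and B back to A; the stored group breaks the loop.
publish_another(42, 'story_a');   # the initial publish seeds the group
publish_another(42, 'story_b');   # story A's template: publish_another(B)
publish_another(42, 'story_a');   # story B's template: publish_another(A) is skipped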

That make sense?

Best,

David
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
Quoting Nick Legg <leggn@denison.edu>:

> I don't yet understand how that is possible.

Well, my publishing behavior is a little different than yours. I'm
using keywords to build an array of related stories, and then run
publish_another on those stories. Those stories do *not* call
publish_another in any way, shape or form.

Where I might run into your issue is during a bulk publish, where
keyword indexes are related to multiple stories. But since the
indexes are already being published as part of the job,
publish_another doesn't spawn any more jobs.

> The hack I added to Burner.pm *prevents* this 'loop' behavior (if
> you properly use the 'notes' field) by forcing an immediate
> publication of the 2nd story.

That makes some sense, and it shouldn't necessarily introduce
performance issues (particularly since you're already on bric_queued).
David's suggestion seems a little more robust, though.

Quick question - what are you using the republish functionality with
the notes for? Doesn't Bric already provide functionality for
republishing related assets?

Something that *might* work in templates is offloading the
publish_another to a utility template. You could pass it the array of
related assets and then have that template flip an attribute in the
published stories that would stop the utility template from running.
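Something like this utility component, maybe. The attribute accessors
are a guess on my part, and something would still need to clear the
flag after the triggered burn, so treat it as a sketch rather than
working code:

<%args>
@related
</%args>
<%perl>
# Flip a persistent flag on each story before republishing it, so that
# when the story burns and reaches this component, it declines to
# cascade further. get_attr/set_attr are assumed accessor names.
for my $rel (@related) {
    next if $rel->get_attr('in_republish');
    $rel->set_attr(in_republish => 1);
    $rel->save;
    $burner->publish_another($rel);
}
</%perl>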

-Matt
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
Quoting "David E. Wheeler" <david@kineticode.com>:

> Yes, that's the difference between publish_another in 1.10 vs. 2.0.
> 2.0 always schedules a publish job, which, when executed, will
> create a new burner object.

I see.

> I believe it is for most organizations. Perhaps we should consider
> making that the default.

I don't think this would be a bad idea.

-Matt
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
Hi everybody,

I'm glad this conversation is happening now, because I could use some
advice, or at least opinions.

IFEX is about to upgrade to Bricolage 2.0.1. They have a whole lot of
publish_another activity that will need to adapt to the new model.

Here's the basic idea:

1. An event happens. (A journalist gets arrested, for example.) IFEX
publishes an alert. Let's call this story the "anchor."

2. The journalist is charged with insulting the king. IFEX publishes
another alert, with the anchor as its related story.

3. The journalist is convicted. IFEX publishes yet another alert, which
also has the anchor as its related story.

4. Etc. As more things happen to the journalist, each update points to
the anchor.


On the right-hand column of anchor and all the updates, we see a list of
the "relation chain," which is teasers for the anchor plus everything
that points to the anchor. (The exception is that it does not show a
teaser for the current story. So you don't see a teaser for the anchor
while you're reading the anchor.)

Here's an example anchor:

http://www.ifex.org/egypt/2011/01/26/egypt_demonstrations/

You'll see the relation chain under "Updates to this story" in the right
column. If you click any of those teasers, you'll see the same relation
chain under "More on this case."

So whenever an anchor or anything that points to an anchor is published,
everything in the relation chain has to be republished too.

We've used burner notes just like Nick did. The story triggering the
publish calls publish_another on the relation chain, then sets a note,
so that none of the members of the chain call publish_another
themselves.

Nick's patch looks like an immediate fix for an imminent upgrade. And I
like David's idea of publish groups. Does this scenario give anybody any
other clever ideas for handling chains?


Thanks so much,

Bret

On Mon, 2011-02-28 at 15:30 -0500, rolfm@denison.edu wrote:
> Quoting Nick Legg <leggn@denison.edu>:
>
> > I don't yet understand how that is possible.
>
> Well, my publishing behavior is a little different than yours. I'm
> using keywords to build an array of related stories, and then run
> publish_another on those stories. Those stories do *not* call
> publish_another in any way, shape or form.
>
> Where I might run into your issue is during a bulk publish, where
> keyword indexes are related to multiple stories. But since the
> indexes are already being published as part of the job,
> publish_another doesn't spawn any more jobs.
>
> > The hack I added to Burner.pm *prevents* this 'loop' behavior (if
> > you properly use the 'notes' field) by forcing an immediate
> > publication of the 2nd story.
>
> That makes some sense, and it shouldn't necessarily introduce
> performance issues (particularly since you're already on bric_queued).
> David's suggestion seems a little more robust, though.
>
> Quick question - what are you using the republish functionality with
> the notes for? Doesn't Bric already provide functionality for
> republishing related assets?
>
> Something that *might* work in templates is offloading the
> publish_another to a utility template. You could pass it the array of
> related assets and then have that template flip an attribute in the
> published stories that would stop the utility template from running.
>
> -Matt

--
Bret Dawson
Producer
Pectopah Productions Inc.
(416) 895-7635
bret@pectopah.com
www.pectopah.com
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
Bret Dawson wrote:
> Hi everybody,
>
> I'm glad this conversation is happening now, because I could use some
> advice, or at least opinions.

Let me throw one, too. :)

Why is publishing related media any different from publishing related stories?
That way the problem would be the same as publishing the same story
multiple times, which is already solved.

Regards, Zdravko
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
On Feb 28, 2011, at 10:24 PM, Zdravko Balorda wrote:

> Let me throw one, too. :)
>
> Why is publishing related media any different from publishing related stories?
> This way the problem would be the same as multiple publishing of the same
> story which is already solved.

They're not any different. AFAIK, if you have PUBLISH_RELATED_ASSETS enabled, both related media and related stories are scheduled for publication at the same time as the story with which they're associated, unless they haven't been changed since the last time they were published.

The only difference between stories and media is that stories are pushed through templates and media are not. And Paul Orrock was working at one time on eliminating that difference, too.

Have I missed something?

Best,

David
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
David E. Wheeler wrote:

> Have I missed something?
>

Obviously I have missed something. I've seen a message like "publishing
related media", but not for stories, so I assumed wrongly. In that case
this topic is more about a programming error. An infinite loop via
publish_another() is very close to while(true){}. No system can fix that. :)

Regards, Zdravko.
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
So, some more info to throw on the fire here: I'm seeing duplication of publish jobs too since I upgraded to 2.0.1. My index pages are published once via the bulk publish, and then perhaps once more via publish_another. The publish goes all the way through, and then all the indexes show up a second time after everything else has run.

I'm certain this is a new occurrence, but I'm not sure what could account for it.

Also, since October the story count has increased from 800 to 900, but burn time has gone up from about 5 minutes to 10-15. My DB has swelled to all of 28MB (on a 2GB RAM MacMini all by its lonesome). The single Postgres process Bric decides to engage reaches 45MB, and spends most of its time between 80-90% in top when the index pages are being burned. Is it safe to assume I'm processor or bus bound here, as everything should be in memory? Is it normal for bric_queued to use a single postgres process almost exclusively?

For the record, I'm running a query that looks like this for the index stories:

@slist => Bric::Biz::Asset::Business::Story->list({element_key_name => 'web_blog_entry', unexpired => 1, publish_status => 1, keyword => $keyword, Order => 'cover_date', OrderDirection => 'DESC'})

And pulling out titles and URIs, mostly. The sitemap, which also runs slow, is another good example:

<%args>
@slist => Bric::Biz::Asset::Business::Story->list({publish_status => 1, unexpired => 1})
</%args>
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
% foreach (@slist){
<url><loc><% $_->get_uri %></loc><lastmod><% $_->get_publish_date("%Y-%m-%d") %></lastmod></url>
%}
</urlset>

Anyway, I think I finally understand what David was talking about here in reference to the DB load.

-Matt
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
On Mar 2, 2011, at 9:08 PM, Matthew Rolf wrote:

> Also, since October the story count has increased from 800 to 900, but burn time has gone up from about 5 minutes to 10-15. My DB has swelled to all of 28MB (on a 2GB RAM MacMini all by its lonesome). The single Postgres process Bric decides to engage reaches 45MB, and spends most of its time between 80-90% in top when the index pages are being burned. Is it safe to assume I'm processor or bus bound here, as everything should be in memory? Is it normal for bric_queued to use a single postgres process almost exclusively?

No, I don't think so. I suggest turning on slow query logging to see if any particular queries are killing your performance. Set log_min_duration_statement to 500 and restart PostgreSQL and watch its log to see queries that run for more than half a second.
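That is, in postgresql.conf:

# log any statement that runs longer than 500 ms
log_min_duration_statement = 500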

> For the record, I'm running a query that looks like this for the index stories:
>
> @slist => Bric::Biz::Asset::Business::Story->list({element_key_name => 'web_blog_entry', unexpired => 1, publish_status => 1, keyword => $keyword, Order => 'cover_date', OrderDirection => 'DESC'})
>
> And pulling out titles and uri's mostly. The sitemap, which also runs slow, is another good example:
>
> <%args>
> @slist => Bric::Biz::Asset::Business::Story->list({publish_status => 1, unexpired => 1})

Oh. That will kick the crap out of it for sure. Why not paginate?

Best,

David
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
Quoting "David E. Wheeler" <david@kineticode.com>:

>> Is it safe to assume I'm processor or bus bound here, as everything
>> should be in memory? Is it normal for bric_queued to use a single
>> postgres process almost exclusively?
>
> No, I don't think so.

I assume you're answering my second question here? If so, let me be
clear that if I start moving around the UI, Bric does grab other
Postgres processes and have them do things. But bric_queued makes just
one friend and stays with it.

> I suggest turning on slow query logging to see if any particular
> queries are killing your performance. Set log_min_duration_statement
> to 500 and restart PostgreSQL and watch its log to see queries that
> run for more than half a second.

Slow queries don't seem to be a huge issue here. In the course of a
bulk publish, it caught 5 commits. Second time through, there were
none, I assume because things were cached.

I will definitely take another look at my conf file and see if I can't
tune it a little more. Question - for queries like the ones I've
listed, are the table scans sequential or random? I assume sequential.

> Oh. That will kick the crap out of it for sure. Why not paginate?

I hadn't considered that. How exactly would that help in this case?
Doesn't the DB still have to round up all the stories regardless of
how many pages they get put into?

Thanks for the time,

Matt
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
Responding to myself, here:

On Mar 3, 2011, at 12:08 AM, Matthew Rolf wrote:

> Also, since October the story count has increased from 800 to 900, but burn time has gone up from about 5 minutes to 10-15. My DB has swelled to all of 28MB (on a 2GB RAM MacMini all by its lonesome). The single Postgres process Bric decides to engage reaches 45MB, and spends most of its time between 80-90% in top when the index pages are being burned. Is it safe to assume I'm processor or bus bound here, as everything should be in memory? Is it normal for bric_queued to use a single postgres process almost exclusively?

So I went back and did some DB tuning to see if I couldn't straighten things out. Increasing work_mem from 8MB to 12MB seemed to improve things a little bit. But the biggest publishing performance increase came from doing the DELETEs recommended in the DBA tuning section of the API docs:

DELETE FROM member
WHERE class__id IN (54, 79, 80)
  AND id NOT IN (
      SELECT member__id
        FROM job_member, job
       WHERE job.id = job_member.object_id
         AND (
                 executing = true
              OR comp_time IS NULL
         )
);

DELETE FROM job
WHERE executing = false
  AND (
         comp_time IS NOT NULL
      OR failed = true
  );

After running these queries (which each deleted about 71,000 records) I ran a bulk publish. The postgres process being used by bric_queued went from 90% to 30-40% in top, bric_queued usage on the bricolage box went from 10% to 54%, and publish time dropped from 20 minutes to 5. iostat showed no more than 3MB/second of usage, which was more than it showed at any time before the delete.

So what exactly is Bric doing with that jobs table that causes it to eat processor so badly? After just 6 months of use on a relatively small, single site, that table growth resulted in a 3x slowdown in publishing speed, and took up the majority of processor use.

Anything else out there that should be periodically cleaned out and might have a similar impact?

Thanks,

-Matt
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
On Mar 5, 2011, at 12:59 PM, Matthew Rolf wrote:

> After running these queries (which each deleted about 71,000 records) I ran a bulk publish. The postgres process being used by bric_queued went from 90% to 30-40% in top, bric_queued usage on the bricolage box went from 10% to 54%, and publish time dropped from 20 minutes to 5. iostat showed no more than 3MB/second of usage, which was more than it showed at any time before the delete.

Not to mention it wiped about 6MB off the database.

-Matt
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
On Mar 4, 2011, at 8:19 AM, rolfm@denison.edu wrote:

> Quoting "David E. Wheeler" <david@kineticode.com>:
>
>>> Is it safe to assume I'm processor or bus bound here, as everything should be in memory? Is it normal for bric_queued to use a single postgres process almost exclusively?
>>
>> No, I don't think so.
>
> I assume you're answering my second question here? If so, let me be clear that if I start moving around the UI, Bric does grab other Postgres processes and have them do things. But bric_queued makes just one friend and stays with it.

No, I mean everything should not be in memory. But now that you ask, bric_queued should not run as a single process, but as a forking process. After it finishes a publish, a fork should exit.
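Schematically, something like this toy loop; it is not bric_queued's
actual code.

use strict;
use warnings;

my @jobs = (101, 102, 103);   # stand-ins for pending publish job IDs

for my $job_id (@jobs) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        print "child $$ executing publish job $job_id\n";
        exit 0;               # the fork exits after its one publish
    }
    waitpid($pid, 0);         # serial here; a real daemon could overlap
}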

> Slow queries don't seem to be a huge issue here. In the course of a bulk publish, it caught 5 commits. Second time through, there were none, I assume because things were cached.
>
> I will definitely take another look at my conf file and see if I can't tune it a little more. Question - for queries like the ones I've listed, are the table scans sequential or random? I assume sequential.

Sequential according to what? They're read sequentially as they're stored on disk, but you cannot predict what that order might be unless you run CLUSTER -- and even then, things won't *stay* in that order.

http://www.postgresql.org/docs/current/static/sql-cluster.html

>> Oh. That will kick the crap out of it for sure. Why not paginate?
>
> I hadn't considered that. How exactly would that help in this case? Doesn't the DB still have to round up all the stories regardless of how many pages they get put into?

You're loading every story in the system into memory. That's incredibly resource-intensive -- especially if you then fetch elements from each of those stories. Each of those element fetches will then execute one or more additional queries.

For that pattern, it's better to use LIMIT/OFFSET to fetch only x stories at a time. That way you don't kill your memory (and you don't have near enough allocated to PostgreSQL not to swap).
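For the sitemap, a paginated pass might look roughly like this. Limit
and Offset as list() parameters are an assumption here, by analogy with
the Order/OrderDirection parameters used above.

my $page_size = 100;
my $offset    = 0;
while (1) {
    my @chunk = Bric::Biz::Asset::Business::Story->list({
        publish_status => 1,
        unexpired      => 1,
        Limit          => $page_size,   # assumed parameter
        Offset         => $offset,      # assumed parameter
    });
    last unless @chunk;
    for my $s (@chunk) {
        printf "<url><loc>%s</loc><lastmod>%s</lastmod></url>\n",
            $s->get_uri, $s->get_publish_date('%Y-%m-%d');
    }
    $offset += $page_size;
}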

Best,

David
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
On Mar 5, 2011, at 9:59 AM, Matthew Rolf wrote:

> After running these queries (which each deleted about 71,000 records) I ran a bulk publish. The postgres process being used by bric_queued went from 90% to 30-40% in top, bric_queued usage on the bricolage box went from 10% to 54%, and publish time dropped from 20 minutes to 5. iostat showed no more than 3MB/second of usage, which was more than it showed at any time before the delete.

That just means that some index needs to be there that isn't. Also, did you try running vacuum or analyze on the database? Do you have autovacuum running?

> So what exactly is Bric doing with that jobs table that causes it to eat processor so badly? After just 6 months of use on a relatively small, single site, that table growth resulted in a 3x slowdown in publishing speed, and took up the majority of processor use.

I'd need to see a query plan from a query that's eating that time in order to answer your question. What indexes do you have on the job table?
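For example:

-- What indexes exist on the job table?
SELECT indexname, indexdef
FROM   pg_indexes
WHERE  tablename = 'job';

-- Freshen the planner's statistics, then capture a plan for one of the
-- slow statements from the log:
VACUUM ANALYZE job;
EXPLAIN ANALYZE <paste the offending query here>;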

> Anything else out there that should be periodically cleaned out and might have a similar impact?

You shouldn't have to do that at all, frankly.

Best,

David
Re: publish_another() with QUEUE_PUBLISH_JOBS enabled
Matt,

I (finally) ran these two DELETE queries on a test environment to see if
I could replicate the behavior you observed. Our publishing performance
was unaffected.

Our indexes look correct and auto-vacuum is enabled.

-Nick

On 3/5/2011 12:59 PM, Matthew Rolf wrote:
> Responding to myself, here:
>
> On Mar 3, 2011, at 12:08 AM, Matthew Rolf wrote:
>
>> Also, since October the story count has increased from 800 to 900, but burn time has gone up from about 5 minutes to 10-15. My DB has swelled to all of 28MB (on a 2GB RAM MacMini all by its lonesome). The single Postgres process Bric decides to engage reaches 45MB, and spends most of its time between 80-90% in top when the index pages are being burned. Is it safe to assume I'm processor or bus bound here, as everything should be in memory? Is it normal for bric_queued to use a single postgres process almost exclusively?
> So I went back and did some DB tuning to see if I couldn't straighten things out. Increasing work_mem from 8MB to 12MB seemed to improve things a little bit. But the biggest publishing performance increase came from doing the DELETEs recommended in the DBA tuning section of the API docs:
>
> DELETE FROM member
> WHERE class__id IN (54, 79, 80)
>   AND id NOT IN (
>       SELECT member__id
>         FROM job_member, job
>        WHERE job.id = job_member.object_id
>          AND (
>                  executing = true
>               OR comp_time IS NULL
>          )
> );
>
> DELETE FROM job
> WHERE executing = false
>   AND (
>          comp_time IS NOT NULL
>       OR failed = true
>   );
>
> After running these queries (which each deleted about 71,000 records) I ran a bulk publish. The postgres process being used by bric_queued went from 90% to 30-40% in top, bric_queued usage on the bricolage box went from 10% to 54%, and publish time dropped from 20 minutes to 5. iostat showed no more than 3MB/second of usage, which was more than it showed at any time before the delete.
>
> So what exactly is Bric doing with that jobs table that causes it to eat processor so badly? After just 6 months of use on a relatively small, single site, that table growth resulted in a 3x slowdown in publishing speed, and took up the majority of processor use.
>
> Anything else out there that should be periodically cleaned out and might have a similar impact?
>
> Thanks,
>
> -Matt