Mailing List Archive: Moving ExecCGI to mod_perl - performance and custom 'modules'

Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

Feb 12, 2021, 8:15 AM

Post #26 of 28 (619 views)

My comment was only meant to avoid the case where someone later searches the archives of
this mailing list for information about DBI and never finds these posts (which are useful
for DBI) because DBI is not in the subject.

On 12.02.2021 00:51, Chris wrote:
> On Thu, Feb 11, 2021 at 09:52:16AM +0100, André Warnier (tomcat/perl) wrote:
>> Isn't this discussion about connection pools and firewalls etc getting a bit
>> far from the initial subject of the thread ?
>
> Perhaps. But this has become a pretty low volume mailing list.
> This "thread" has moved me to spend hours looking at changing and/or
> better understanding the work I have done (pretty old code) and the
> work I am now starting.
>
> For me, I'm re-reading the manual pages for the DBI modules,
> etc. I've also added another mailing list to follow about DBI.
>
> And I will now have some threads to add in the near future.
> Threads I wouldn't have thought of.
> But this isn't my mailing list, so breaking these topics into new
> threads is just fine. Not a problem at all. 8-)
>
> Recently, something "clicked on" for me about mod_perl.
> Which is pretty thrilling for me. ;-}
>
> Chris
>
>
>>
>> On 09.02.2021 23:03, Mithun Bhattacharya wrote:
>>> I would consider mine a small setup on an internal network, and I have
>>> used both Sybase and SQL Server. In our case the DBAs preferred that we
>>> remain connected rather than make too many connections. We need DB
>>> access in bursts: it could be quiet for more than an hour, and then
>>> suddenly we might need hundreds of connections within a few minutes (if we
>>> didn't cache them). Another thing was that we were connecting from forked
>>> processes, so at some point everything gets reaped, including the
>>> connections. Our style of coding has been to connect to the DB wherever
>>> we actually need to fire one or more SQL statements, and to call
>>> connect_cached in the actual implementation (it is a separate library,
>>> since we had to add a wrapper to acquire credentials).
>>>
>>> On Tue, Feb 9, 2021 at 2:34 PM James Smith <js5@sanger.ac.uk> wrote:
>>>
>>> Mithun,
>>>
>>> I’m not sure what scale you work at – but these observations come from experience with
>>> sites under small to medium load – and we rarely see an appreciable gain from cached or
>>> pooled connections, just the occasional heartache they cause.
>>> If you are working on small applications with a minimal number of databases on the DB
>>> server then you may see some performance improvement (but, to be honest, not as much as
>>> you used to, as the servers have changed). Unfortunately I don’t, in either my main or
>>> secondary role, and I know many others who come across these limitations as well.
>>>
>>> I’m not saying don’t use persistent or cached connections – but leaving it to some
>>> hidden layers is not necessarily a good thing to do – it can have unforeseen side
>>> effects (Apache::DBI and PHP pconnect have both shown these up).
>>>
>>> If you are working with, e.g., MySQL, the overhead of the (socket) connection is
>>> very small, but keeping more connections open to cope with persistent connections
>>> often means specifying a much larger database server (memory-wise) – or not being able
>>> to do all the nice tricks with in-memory indexes and queries [to increase query
>>> performance]. Being able to choose which connections you keep open and which you
>>> open/close on a per-request basis gives you the benefits of caching without the risks
>>> involved [other than the “lock table” issue].
>>>
>>> From: Mithun Bhattacharya <mithnb@gmail.com>
>>> Sent: 09 February 2021 18:34
>>> To: mod_perl list <modperl@perl.apache.org>
>>> Subject: Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]
>>>
>>> Connection caching does work for most use cases - we have to accept James works in
>>> scenarios most developers can't fathom :)
>>>
>>> If you are just firing off simple SQL statements without any triggers or named temporary
>>> tables involved you should be good. The only times we recall tripping on a cached
>>> connection were when two different code snippets tried to create the same temporary
>>> table, and another time when the code was expecting the disconnect to complete the
>>> connection cleanup.
>>>
>>> On Tue, Feb 9, 2021 at 11:47 AM Vincent Veyron <vv.lists@wanadoo.fr> wrote:
>>>
>>> On Sun, 7 Feb 2021 20:21:34 +0000
>>> James Smith <js5@sanger.ac.uk> wrote:
>>>
>>> Hi James,
>>>
>>> > DBI sharing doesn't really gain you much - and can actually lead you into a
>>> > whole world of pain. It isn't actually worth turning it on at all.
>>> >
>>>
>>> Never had a problem with it myself in years of using it, but I wrap my queries in
>>> an eval { } and check $@, so that the scripts are not left hanging; also, I have a
>>> PostgreSQL db ;-).
>>>
>>> I ran some tests with ab, and I do see an improvement in response speed:
>>>
>>> my $dbh = DBI->connect()
>>> Concurrency Level: 5
>>> Time taken for tests: 22.198 seconds
>>> Complete requests: 1000
>>> Failed requests: 0
>>> Total transferred: 8435000 bytes
>>> HTML transferred: 8176000 bytes
>>> Requests per second: 45.05 [#/sec] (mean)
>>> Time per request: 110.990 [ms] (mean)
>>> Time per request: 22.198 [ms] (mean, across all concurrent requests)
>>> Transfer rate: 371.08 [Kbytes/sec] received
>>>
>>> my $dbh = DBI->connect_cached()
>>> Concurrency Level: 5
>>> Time taken for tests: 15.133 seconds
>>> Complete requests: 1000
>>> Failed requests: 0
>>> Total transferred: 8435000 bytes
>>> HTML transferred: 8176000 bytes
>>> Requests per second: 66.08 [#/sec] (mean)
>>> Time per request: 75.664 [ms] (mean)
>>> Time per request: 15.133 [ms] (mean, across all concurrent requests)
>>> Transfer rate: 544.33 [Kbytes/sec] received
>>>
>>>
>>> --
>>>
>>> Best regards, Vincent Veyron
>>>
>>> https://compta.libremen.com
>>> Free double-entry general accounting software
>>>
>>> -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity
>>> registered in England with number 1021457 and a company registered in England with
>>> number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
>>>
>>
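
A quick check of the ab figures quoted above, as a minimal sketch in plain Perl. The numbers are copied from the two runs; the %runs hash and the two helper subs are hypothetical names used only for illustration and are not part of ab or DBI.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Figures copied from the two ab runs quoted in this thread.
my %runs = (
    connect        => { requests => 1000, seconds => 22.198, concurrency => 5 },
    connect_cached => { requests => 1000, seconds => 15.133, concurrency => 5 },
);

# Requests per second: completed requests divided by wall-clock time.
sub requests_per_second {
    my ($run) = @_;
    return $run->{requests} / $run->{seconds};
}

# Mean time per request in milliseconds, the way ab reports it:
# concurrency * total time / completed requests.
sub ms_per_request {
    my ($run) = @_;
    return 1000 * $run->{concurrency} * $run->{seconds} / $run->{requests};
}

for my $name (sort keys %runs) {
    printf "%-15s %6.2f req/s  %7.2f ms/request\n",
        $name,
        requests_per_second($runs{$name}),
        ms_per_request($runs{$name});
}

# Relative throughput gain of the cached run over the uncached run.
my $gain = requests_per_second($runs{connect_cached})
         / requests_per_second($runs{connect}) - 1;
printf "throughput gain with connect_cached: %.0f%%\n", 100 * $gain;
```

In this one test, connect_cached saved about 7 seconds of wall-clock time over 1000 requests (22.198 s versus 15.133 s), i.e. roughly 7 ms per request. That is a single measurement on a single setup, and it says nothing about the memory cost of keeping connections open that James describes.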

Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

cpb_mod_perl at bennettconstruction

Feb 12, 2021, 11:10 AM

Post #27 of 28 (619 views)

On Fri, Feb 12, 2021 at 05:15:29PM +0100, André Warnier (tomcat/perl) wrote:
> My comment was only meant to avoid the case where someone later searches
> the archives of this mailing list for information about DBI and never finds
> these posts (which are useful for DBI) because DBI is not in the subject.

I can't disagree with that. Perhaps put up a short thread for the archives,
with a good subject line, a summary of what this thread covers, and some kind
of link back to this thread?

Some of the topics I found really helpful are also in long threads
that wander off the subject. I've saved quite a few useful posts from
those threads. Good archiving is very important.
I have found using regular web search engines to be an exercise in
useless frustration.

I'm reading up on several of the topics here. I plan to ask some more
questions about those after a bit. What is a good way to reference back
to this thread's sections that will get into the archive in a useful way?

Chris

Re: Moving ExecCGI to mod_perl - performance and custom 'modules'

bac2bac at bac2bac

Mar 19, 2021, 4:17 PM

Post #28 of 28 (609 views)

Porting code from CGI to mod_perl is a fun project.

This is a late reply to your original post, but I wanted to point out an
excellent resource: the Practical mod_perl book. The link for the book
on the mod_perl site (perl.apache.org) is broken, but you can find it at
https://modperl2book.org/mp1/. It was written when mod_perl 1 was in use,
but the concepts it explains are still valid. Be aware that there was some
renaming in mod_perl 2. For example, Apache::Registry became
ModPerl::Registry.

Chapter 6, "Coding with mod_perl in mind", is full of good stuff about
what to watch out for as you refactor your code for mod_perl, including
the problem with __DATA__ that you had (from a later post). In fact, the
whole book is a great read. I highly recommend skimming through it and
reading in detail as you see fit.
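
On the __DATA__ point: under the registry handlers the script body gets wrapped into a subroutine, so a trailing __DATA__ (or __END__) section cannot be relied on. One common workaround, sketched here under assumptions, is to move the payload into a here-document and read it through an in-memory filehandle. The key=value payload and the parse_settings sub are hypothetical and only stand in for whatever the script used to read from DATA.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical payload that a CGI script might have kept after __DATA__.
my $data = <<'END_DATA';
alpha=1
beta=2
END_DATA

# Parse "key=value" lines from a string. Opening a reference to a scalar
# gives an in-memory filehandle, so code written as "while (<DATA>)" needs
# very little change.
sub parse_settings {
    my ($text) = @_;
    my %settings;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;    # skip empty lines
        my ($key, $value) = split /=/, $line, 2;
        $settings{$key} = $value;
    }
    close $fh;
    return \%settings;
}

my $settings = parse_settings($data);
print "$_ = $settings->{$_}\n" for sort keys %$settings;
```

The same script still runs unchanged as plain CGI, which makes it easier to port one script at a time.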

On 2/6/2021 4:59 PM, Steven Haigh wrote:
> Hi all,
>
> So for many years I've been slack, writing Perl scripts to do various
> things - but never needing more than the normal Apache + ExecCGI and
> Template Toolkit.
>
> One of my sites has become a bit more popular, so I'd like to spend a
> bit of time on performance. Currently, I'm seeing ~300-400ms of what I
> believe to be execution time of the script loading, running, and then
> blatting its output to STDOUT so the browser can go do its thing.
>
> I believe most of the delay is due to loading Perl, its
> modules, etc.
>
> I know that the current trend would be to re-write the entire site in a
> more modern, daemon-based solution - and I started down the Mojolicious
> path - but the amount of re-writing to save a third of a second seems to
> be excessive.
>
> Would I be correct in thinking that mod_perl would help in this case?
>
> I did try a basic test, but I have a 'use functions' in all my scripts
> that loads a .pm with some global vars and a lot of common subs - and
> for whatever reason (can't find anything on Google as to why), none of
> the subs are recognised in the main script when loaded via ModPerl::PerlRun.
>
> So throwing it out to the list - am I on the right track? wasting my
> time? or just a simple mistake?
>
> --
> Steven Haigh netwiz@crc.id.au
> https://www.crc.id.au