Mailing List Archive

cache an object in modperl
Hello

I am not very familiar with mod_perl.

For work, I need to access the IANA TLD database.

So I wrote this Perl module:
https://metacpan.org/pod/Net::IANA::TLD

But each call to new() in the module downloads the database file
from IANA's website.

I know this is pretty inefficient.

My question is, can I cache the object returned by new() in mod_perl?

If so, how?

Thanks.
Re: cache an object in modperl
Your cache would have to be independent of mod_perl. I would suggest
saving it to a Redis instance.

Re: cache an object in modperl
If the database doesn't change very often, and you don't mind only
getting updates to your database when you restart Apache, and you're
using prefork mod_perl, then you could use a startup.pl to load your
database before Apache forks, and get a shared copy globally in all your
Apache children.

https://perl.apache.org/docs/1.0/guide/config.html#The_Startup_File

This thread from 13 years ago seems to have a clear-ish example of how
to use startup.pl to do what I'm talking about.
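
A minimal sketch of the startup.pl idea, assuming prefork mod_perl. The
package name My::TLDCache and its preload/get methods are hypothetical;
only Net::IANA::TLD->new comes from this thread. The loader is passed in
as a code reference so the sketch also runs without Apache or the CPAN
module installed.

```perl
package My::TLDCache;    # hypothetical helper, not part of Net::IANA::TLD
use strict;
use warnings;

my $object;    # one copy per process; shared copy-on-write after the fork

# Build the object once. Call this from startup.pl so that it runs in the
# Apache parent process before the children are forked.
sub preload {
    my ($class, $loader) = @_;
    $object = $loader->();
    return $object;
}

# Handlers call this instead of new(), so nothing is downloaded per request.
sub get {
    my ($class) = @_;
    die "My::TLDCache->preload was not called\n" unless defined $object;
    return $object;
}

1;

# Assumed usage in startup.pl:
#   use Net::IANA::TLD ();
#   My::TLDCache->preload(sub { Net::IANA::TLD->new });
```

With this layout the data is only as fresh as the last Apache restart,
which is the trade-off described above.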

If you need it to update more frequently than when you restart Apache,
you could potentially use a PerlChildInitHandler to load the data when
Apache creates children. This will use more memory, as each child will
have its own copy, and can also result in a situation where children
have different versions of the database loaded while serving requests
at the same time. If you want to go this way, you might also want to add
a MaxRequestsPerChild directive to your Apache config to make sure that
your children die and get refreshed regularly, if you don't
already have one.
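
A rough sketch of the PerlChildInitHandler variant, assuming mod_perl 2.
The module name My::TLDChildInit, the $LOADER and $TLD variables, and the
MaxRequestsPerChild value are all hypothetical. The handler returns the
literal 0, which is the numeric value of Apache2::Const::OK, so that the
sketch can be exercised outside Apache.

```perl
package My::TLDChildInit;    # hypothetical handler module
use strict;
use warnings;

# Code ref that builds the object; set it in startup.pl, for example
#   $My::TLDChildInit::LOADER = sub { Net::IANA::TLD->new };
our $LOADER;
our $TLD;      # this child's private copy of the loaded object

# mod_perl 2 calls a PerlChildInitHandler with ($child_pool, $server),
# once in every newly forked child.
sub handler {
    my ($child_pool, $server) = @_;
    $TLD = $LOADER->();
    return 0;    # numeric value of Apache2::Const::OK
}

1;

# Assumed httpd.conf lines (the value 1000 is only an example):
#   PerlChildInitHandler My::TLDChildInit
#   MaxRequestsPerChild  1000
```

Each child downloads its own copy when it starts, so recycling children
regularly is what keeps the data from going stale.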

Adam


Re: cache an object in modperl
That's great. Thank you Adam.

Adam Prime wrote:
> If the database doesn't change very often, and you don't mind only
> getting updates to your database when you restart apache, and you're
> using prefork mod_perl, then you could use a startup.pl to load your
> database before apache forks, and get a shared copy globally in all your
> apache children.
Re: cache an object in modperl
I left out the link to the thread. Here it is.

https://marc.info/?t=119062870700002&r=1&w=2



Re: cache an object in modperl
Startup is not a great idea if your webserver is up forever. I have some
which have been running for months.

How frequently do you wish to refresh the cache? If you do it in startup,
then your cache refresh is tied to the service restart, which might not be
ideal or feasible.

Re: cache an object in modperl
Hello

Mithun Bhattacharya wrote:
> How frequently do you wish to refresh the cache? If you do it in startup,
> then your cache refresh is tied to the service restart, which might not
> be ideal or feasible.

I saw that IANA recently updated their database on these dates:

2020.09.09
2020.09.13

So I assume they update the DB file every few days.

Regards.
Re: cache an object in modperl
So how flexible are you with your service restart, and how frequently do
you wish to update your cache?

Does IANA have an easy way of determining whether there has been an update
since a certain date? I was thinking it might make sense to just run a
scheduled job to monitor for updates and then restart your service or
refresh your local cache, depending upon how you solve it.

Re: cache an object in modperl
Mithun Bhattacharya wrote:
> Does IANA have an easy way of determining whether there has been an
> update since a certain date? I was thinking it might make sense to just
> run a scheduled job to monitor for updates and then restart your service
> or refresh your local cache, depending upon how you solve it.

Yes, I agree with this.
I may monitor IANA's database via their version changes, and run a
cron job to restart my Apache server during a low-traffic time
(e.g., 3:00 AM).

Or do you have a better solution?
Thanks.
Re: cache an object in modperl
Haha, I can't answer that. I work with systems which are always up. We have
users working across the globe, so there is no non-active time.

In my case I would have to set up an independent cache (my current choice
is Redis, but you could choose a DB_File for all I know) and refresh it as
needed. IANA I could hit every 30 min to check for updates :)
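
As an illustration of an independent cache that is refreshed as needed,
here is a small sketch that swaps in a plain file written with the core
Storable module instead of Redis or DB_File. The function name
cached_load, the cache path, and the 30-minute lifetime in the usage
comment are assumptions. The loader has to return a reference, because
Storable can only store references.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);

# Return the data cached in $file if it is younger than $max_age seconds.
# Otherwise call $loader, store its result in $file, and return it.
sub cached_load {
    my ($file, $max_age, $loader) = @_;

    if (-e $file) {
        my $age = time() - (stat($file))[9];    # seconds since last write
        return retrieve($file) if $age < $max_age;
    }

    my $data = $loader->();
    my $tmp  = "$file.$$.tmp";                  # per-process temporary name
    nstore($data, $tmp);
    # Rename into place so readers never see a half-written cache file.
    rename $tmp, $file or die "rename $tmp -> $file failed: $!\n";
    return $data;
}

# Assumed usage: refresh at most every 30 minutes (path is hypothetical).
#   my $tlds = cached_load('/var/cache/tld.sto', 30 * 60, \&download_tlds);
```

Because the cache lives outside the Apache processes, every child and any
cron job can share it, and no restart is needed to pick up new data.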

Re: cache an object in modperl
On Sun, Sep 13, 2020, at 21:51, Wesley Peng wrote:
> For work, I need to access the IANA TLD database.
>
> So I wrote this Perl module:
> https://metacpan.org/pod/Net::IANA::TLD
>
> But each call to new() in the module downloads the database file
> from IANA's website.
>
> I know this is pretty inefficient.

Not only is it inefficient, but you also abuse remote resources and risk
having your access rate limited or blocked outright.

You should use the caching features that HTTP provides, as the resource has an ETag:

$ wget -SqO /dev/null http://www.internic.net/domain/root.zone
HTTP/1.1 200 OK
Date: Mon, 14 Sep 2020 15:17:50 GMT
Server: Apache
Last-Modified: Mon, 14 Sep 2020 05:44:00 GMT
Content-Length: 2164237
Vary: Accept-Encoding
ETag: "21060d-5af3f856f0800"
Accept-Ranges: bytes
Cache-Control: max-age=420
Expires: Mon, 14 Sep 2020 15:22:04 GMT
X-Frame-Options: SAMEORIGIN
Referrer-Policy: origin-when-cross-origin
Content-Security-Policy: upgrade-insecure-requests
Age: 165
Keep-Alive: timeout=2, max=358
Connection: Keep-Alive
Content-Type: text/plain; charset=UTF-8
Content-Language: en


So you can do a conditional GET as long as you store the latest ETag
on your side:

$ wget -SqO /dev/null --header 'If-None-Match: "21060d-5af3f856f0800"' http://www.internic.net/domain/root.zone
HTTP/1.1 304 Not Modified
Date: Mon, 14 Sep 2020 15:20:43 GMT
Server: Apache
Connection: Keep-Alive
Keep-Alive: timeout=2, max=358
ETag: "21060d-5af3f856f0800"
Expires: Mon, 14 Sep 2020 15:22:04 GMT
Cache-Control: max-age=420
Vary: Accept-Encoding


All of this has nothing to do with mod_perl, and in fact very little to do with Perl at all.

See also the "Cache-Control" and "Age" headers.

Your module on CPAN should take care of that automatically.
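
A sketch of what such a conditional GET could look like in Perl, using
the core HTTP::Tiny module. The function name fetch_if_changed is
hypothetical, and how the ETag is stored between runs is left to the
caller. The HTTP client is passed in as an argument so the logic can be
tried without touching the network.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Fetch $url unless the server's copy still matches $etag.
# Returns (undef, $etag) on 304 Not Modified,
# or ($body, $new_etag) when fresh content was downloaded.
sub fetch_if_changed {
    my ($http, $url, $etag) = @_;

    # Only send If-None-Match when we have an ETag from an earlier fetch.
    my %headers = defined $etag ? ('If-None-Match' => $etag) : ();
    my $res = $http->get($url, { headers => \%headers });

    return (undef, $etag) if $res->{status} == 304;
    die "GET $url failed: $res->{status} $res->{reason}\n"
        unless $res->{success};

    # HTTP::Tiny lower-cases response header names.
    return ($res->{content}, $res->{headers}{etag});
}

# Assumed usage:
#   my ($body, $etag) = fetch_if_changed(
#       HTTP::Tiny->new,
#       'http://www.internic.net/domain/root.zone',
#       $saved_etag,    # undef on the first run
#   );
```

HTTP::Tiny also has a mirror() method that sends If-Modified-Since based
on the local file's modification time, which may be enough if tracking
the ETag is not needed.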

PS: The set of TLDs does not change that often, so fetching once per day or once per week should be enough (with a manual override for the exceptional cases that need it). But it depends on why you do it. Note that the whole content is also available as a zone transfer from various root servers.

--
Patrick Mevzek