Re: [PHP] Aggressive PHP Smart Caching



Alexander,

i have begun to experiment with your caching tool.
i wonder if you would mind providing some feedback.

firstly i noticed a call to ob_start(),

ob_start('cacheSave');

which references the function cacheSave(), defined beneath the initial
bit of code. in order to actually send anything to the client, it is
then the responsibility of the user of the caching tool to call
ob_end_flush(), thus invoking cacheSave().

it is also important that the script calling ob_end_flush() have access to
the definition of cacheSave() so it may have to be copied
into that script, or broken out into a separate script in order to be
included in the script calling ob_end_flush().
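
A minimal sketch of the arrangement described above, not the tool's actual code. The body of cacheSave() and the helper cacheFilePath() are assumptions made up for illustration; only ob_start('cacheSave') and ob_end_flush() come from the discussion.

```php
<?php
// Hypothetical helper: where the sketch stores its cached copy.
function cacheFilePath(): string
{
    return sys_get_temp_dir() . '/cache_sketch_' . getmypid() . '.html';
}

// Hypothetical stand-in for the tool's cacheSave(); the real body may differ.
// In a real setup this function could live in its own file and be pulled in
// with require_once by every script that ends the buffer.
function cacheSave(string $buffer): string
{
    // Store a copy of the page on disk...
    file_put_contents(cacheFilePath(), $buffer);
    // ...and return it unchanged so PHP still sends it to the client.
    return $buffer;
}

ob_start('cacheSave');          // everything echoed from here on is buffered
echo "<p>expensive page</p>\n";
ob_end_flush();                 // runs cacheSave() on the buffer, then sends it
```

As far as I know, PHP also flushes any open output buffer when the script ends, so the callback should run even without an explicit ob_end_flush(), provided the function is defined by then.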

there is also another thing i ran into trouble with: the call to
the header() function in the first bit of code in the cache tool,

header('Expires: '.gmdate('D, d M Y H:i:s',$cache_expires).' GMT');

the problem is that other scripts in the application may already be calling
the header() function, and therefore an error will be raised
since headers have already been sent by the cache code. would it negatively
impact the design of the cache to place the header()
invocation after the call to ob_start() ? that way it would not impact
other scripts that call header() since none of them will actually
be sent until the call to ob_end_flush().
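
A small sketch of the ordering proposed above, assuming nothing about the rest of the cache tool. The function name expiresHeader() and the 300-second lifetime are invented for the example; the header format is the one quoted from the tool.

```php
<?php
// Build the Expires line quoted above from a Unix timestamp.
function expiresHeader(int $cacheExpires): string
{
    return 'Expires: ' . gmdate('D, d M Y H:i:s', $cacheExpires) . ' GMT';
}

ob_start();                           // start buffering first
echo "page body\n";                   // output is held in the buffer
header(expiresHeader(time() + 300));  // still allowed: no output has left PHP
ob_end_flush();                       // headers and body go out together here
```

With buffering started first, other scripts can keep calling header() at any point before the buffer is flushed.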

unfortunately, i have been unable to get the tool to cache a single page
yet, mainly because this is my first real experience with
output buffering. i am eager to get it working, and if i can, i will
likely see it installed on a production system. looking
forward to hearing from you.

thanks,

-nathan


On 6/26/07, Alexander Romanovich <andala@xxxxxxxxx> wrote:

Thank you for your reply Nathan.

You are right that this method of caching is different than the two types
you have outlined below. I would not say that it is a new method though, in
fact, "pushing static files" to the server is very common. If it weren't for
the fact that this method, as I have designed it, allows a very tiny PHP
overhead to handle dynamic updating of the cache I could have even gone the
extra mile to push html files that would be loaded directly by the end user
without PHP being initialized at all. (My reasons for not taking this last
step should become apparent to those who read the wishlist I produced at
http://technologies.babywhale.net/cache/ )

Understand that this method does not *exclude* using the other two methods
you have outlined. In fact, I personally make use of memcached and APC where
I feel it is appropriate in my application design. This does not mean that I
can not also write a cache layer that makes the application itself and its
variables irrelevant and not required for most site hits (hence a major
optimization).

To answer your other questions:

1) Caching on disk could easily be handled instead by caching in memory,
but this approach is meant to be ultra-portable and work everywhere. There
are situations where a viable memory storage mechanism is simply not
available, and other cases where it is not desirable to consume memory for
this purpose and plenty of hard drive storage space is a good alternative. I
think you will find this caching method is intensely speed-tuned and a
fast implementation of a portable file system based method. I would also
point out that in my line of work, where I chiefly have to adopt
environments that are configured under rather political circumstances, it is
consistently this type of caching that the system administrators argue
for. As someone has already pointed out, there may not even be a significant
difference between disk and memory based storage mechanisms on your server.

2) Again, one of the main theories behind this method is portability. In
order to not rely on cron, server queries, or other external checks for a
stale cache, I have gone with a "refresh interval" which has been proposed
on this list in the past. It proposes that dynamic content should be
refreshed once every X seconds/minutes/hours. This script avoids PHP date
manipulations and instead performs some basic math to handle the refresh
rate, but also to *sync* content to some degree, so portions of dynamic
content are less likely to haphazardly refresh independently and therefore
not match. I think this is a slight improvement over code that has been
posted here before. In a practical sense, this means that your application
fires and produces content only once every X minutes, and not each and every
time the page is hit. Furthermore, because in this case it is known ahead of
time when that page will expire, a cache header can be sent with an exact
expiration time so repeated hits by the same end user will not even trigger
a transmission of cached content from the server.
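
One way the "basic math" in point 2 might look, as a sketch only; the actual script may compute this differently. The function names and the 300-second interval are assumptions chosen for the example.

```php
<?php
// Start of the refresh window that contains $now. Every request inside the
// same window gets the same value, which keeps separate pages in sync.
function windowStart(int $now, int $interval): int
{
    return $now - ($now % $interval);
}

// Moment the current window ends; usable as the exact expiration time.
function windowEnd(int $now, int $interval): int
{
    return windowStart($now, $interval) + $interval;
}

// A cached file is stale if it was written before the current window began.
function isStale(int $fileMtime, int $now, int $interval): bool
{
    return $fileMtime < windowStart($now, $interval);
}

$interval = 300; // refresh every 5 minutes (example value)
$now = time();
printf("window %d..%d\n", windowStart($now, $interval), windowEnd($now, $interval));
```

Because the window boundaries depend only on the clock and the interval, every page using the same interval rolls over at the same moment, and windowEnd() can feed the Expires header directly.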

3) In regards to daily purging: for one, if you are going for a scheduled
refresh of content, then you probably already have a refresh rate that is
less than 24 hours, so an additional daily trigger of recaching
should be acceptable. But more specifically, the reason behind this is
that a file system based caching method does not natively support a TTL on
cached files, and there has to be some way to handle a cache of a script
that has since been deleted. Note that if 24 hours is not acceptable for
some reason, this script can easily be modified to increase that without
negatively affecting anything else.
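
A rough sketch of an age-based purge like the one point 3 describes, using file modification times in place of a native TTL. The function purgeOldCache(), the directory layout, and the file names are all hypothetical; the real script may purge differently.

```php
<?php
// Delete files in $dir whose modification time is older than $maxAge seconds.
// Returns the number of files removed.
function purgeOldCache(string $dir, int $maxAge, int $now): int
{
    $removed = 0;
    foreach (glob($dir . '/*') ?: [] as $file) {
        if (is_file($file) && $now - filemtime($file) > $maxAge) {
            unlink($file);
            $removed++;
        }
    }
    return $removed;
}

// Example: a throwaway directory with one fresh and one two-day-old file.
$dir = sys_get_temp_dir() . '/cache_purge_sketch_' . uniqid();
mkdir($dir);
file_put_contents("$dir/fresh.html", 'new');
file_put_contents("$dir/old.html", 'old');
touch("$dir/old.html", time() - 2 * 86400); // backdate the old file

$removed = purgeOldCache($dir, 86400, time()); // 86400 s = 24 hours
echo "removed $removed file(s)\n";
```

Changing the 86400 passed as $maxAge is all it takes to use a different purge age.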

On Jun 24, 2007, at 11:55 PM, Nathan Nobbe wrote:

Alexander,

sorry to see nobody has replied to your post, im sure you worked very hard
on the cache system and are eager for feedback..

so to me it looks like youve introduced a somewhat new style of caching
here (though im sure there are other such approaches); for instance i know
of 2 main uses for caches at this time [as caching pertains to php].

1. caching php intermediate code
2. caching application variables

both of these caching techniques are designed to overcome limitations of
the language as it ships out of the box, more or less; afaik.
it appears you are interested in caching the output of php scripts, which
is, i suppose, a third technique that could be added to the list.
so i have a criticism about your system and a couple questions as well.

*criticism*

- why cache script output on disk? if a fast cache is your goal,
why not store the result of script output in memory rather than on disk;
that would be much faster

*questions*

- how does your cache system know when cached output is stale and
allow fresh contents to be delivered from the original script rather than
being served from the cache?
- why purge cache contents after 24 hours? im on the memcached
mailing list, and recently they were discussing artificially resetting the
cache; several people said they let memcached run for months on end.

-nathan






