gz compression rates with custom buffer callback



Hi,

First, thanks to those who'll read me 'til the end. I know my code can seem a bit messed up, that there may be pretty obvious solutions, and that my English may be bad; but optimization madness brought me here. I'm sure you'll understand :).

Basically, what I want is a classic output-buffer callback function that runs gzencode() on the buffer. There's a native function for that, I know, but I want a little more: compression stats (and maybe even more later).

Here's the idea: (don't scream, I'll explain)

# gz stats (buffer callback)
function gz_cmp($buffer) {
    $buffer = preg_replace_callback(
        '/\$gz\-stats=(\d+)\$/',
        create_function(
            '$status',
            'if( $status[1] ) {
                # ' . ($gz_activated = true) . '
                # ' . ($s_size = strlen($buffer)) . '
                # ' . ($c_size = strlen(gzencode($buffer, 9))) . '

                return ' . round((100 - $c_size / $s_size * 100), 1) . ' . \'%\';
            }'
        ),
        $buffer
    );

    if( $gz_activated ) {
        header('Content-Encoding: gzip');
        $buffer = gzencode($buffer);
    }

    return $buffer;
}

gz_cmp() = GZ Compression callback function
$status = array('$gz-stats=1', '1')
-> $status[1] = GZ Compression availability (1 or 0)
$gz_activated = $status[1], as a flag for the function's ending
$s_size = Plain-text buffer (document) size
$c_size = GZ compressed buffer (document) size
[weird formula] = Compression rate (in %)
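
To make the "weird formula" less weird, here is a minimal standalone sketch of the rate computation. The helper name gz_rate() is only an illustration and is not part of my script; it assumes the zlib extension is loaded.

```php
<?php
// Hypothetical helper: percentage of bytes saved by gzip, rounded to one decimal.
function gz_rate($plain, $level = 9) {
    $s_size = strlen($plain);                      // plain-text size
    if ($s_size === 0) {
        return 0.0;                                // avoid division by zero
    }
    $c_size = strlen(gzencode($plain, $level));    // compressed size
    return round(100 - $c_size / $s_size * 100, 1);
}

echo gz_rate(str_repeat('Hello, world! ', 1000)), "%\n";
```

A rate of 80 would mean the compressed document is one fifth of the original size.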


Explanations:

Somewhere in the page, the script outputs $gz-stats=1$ or $gz-stats=0$, depending on whether GZ compression is used or not (decided through a constant and a few checks, such as GZ module availability and the browser's Accept-Encoding HTTP header; well, whatever). Of course, 1=Enabled and 0=Disabled.
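
For the curious, the checks look roughly like the sketch below. It is a simplification, not my actual code: gz_available() is a made-up name, and it ignores things like "gzip;q=0" in the header.

```php
<?php
// Hypothetical helper: true when zlib is loaded and the client lists gzip.
// Simplification: a client sending "gzip;q=0" would be wrongly accepted.
function gz_available($accept_encoding) {
    return extension_loaded('zlib')
        && stripos($accept_encoding, 'gzip') !== false;
}

// The header may be missing entirely, so fall back to an empty string.
$accept = isset($_SERVER['HTTP_ACCEPT_ENCODING']) ? $_SERVER['HTTP_ACCEPT_ENCODING'] : '';

// Build the marker that the output-buffer callback will look for later.
$marker = '$gz-stats=' . (gz_available($accept) ? 1 : 0) . '$';
echo $marker;
```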

Now the output buffer ends, and it's the callback function's turn: gz_cmp(). The first thing that comes to mind might be to search for $gz-stats=x$ and THEN replace it with the stats, actually encode the doc, send the Content-Encoding header, and so on; OR simply return the buffer unchanged if $gz-stats=0$.

But this means two regexp searches through the whole doc when GZ is activated: one to check for $gz-stats=x$, one for the replacement.
-> I want one.

So I thought about it. I'm no genius, and that's probably why it isn't really working as expected, but here's the idea: make the replacement directly with preg_replace_callback() which, as its name implies, calls a function back too. The only argument passed to that callback is an array of matches, with the first value holding the whole matched pattern and the rest holding one value per parenthesised group. I have only one group, which should only be 1 or 0 (GZ activated or not), in the second value of the array.
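
Here is a tiny standalone sketch of that matches array, using the same pattern as above. The callback name show_matches() and the sample string are only for illustration.

```php
<?php
// Named callback, used here only to show what the matches array contains.
function show_matches($status) {
    // $status[0] is the whole match, $status[1] the first parenthesised group.
    return '[' . $status[0] . ' -> ' . $status[1] . ']';
}

$page   = 'Stats: $gz-stats=1$';
$result = preg_replace_callback('/\$gz\-stats=(\d+)\$/', 'show_matches', $page);

echo $result, "\n"; // prints: Stats: [$gz-stats=1$ -> 1]
```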

I decided to make a lambda callback function for the replacement (or 'anonymous function') with create_function(). This lets me get the buffer sizes (original and compressed) from outside the replace callback function (as I can NOT pass them as parameters). The other advantage I thought this system would give me was that I could set an external flag ($gz_activated) from within the lambda callback in order to ACTUALLY encode the doc AFTER having replaced $gz-stats=1$ with the compression stats (which is a simple rate in %, by the way).

Why so many complications? I don't want to use globals. I know you thought of it ;). Portability purposes only.

This stuff seems to work great. As you can see, I put some lines of the lambda callback in comments so that the flag is set and the lengths are computed outside the string. Then I return the stats (which, by the way, hover around 80%, GZ rocks!) to preg_replace_callback(), which replaces $gz-stats=1$ with them. Once done, the result is stored in $buffer (which is thereby updated). The flag is set, so I can now send the HTTP header to tell the browser we're gonna send some encoded stuff, and then actually encode it.

The $buffer, now modified and encoded, is eventually returned, the output buffer is flushed and here we go.


\o/. Or not..
Now I try to set $gz-stats=0$. The stats aren't displayed, as expected, but after sniffing the headers, I found the content was still GZ-encoded. For whatever reason, $gz-stats$ has been set to 0 by the main script, so we don't want compression here.

The reason?
Apparently, PHP parses all the lambda function code twice. First to 'decode' it (don't forget it's just a string), and then to execute it properly. Well, it's my guess anyway.
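
Testing the trick in isolation seems to point somewhere else, though. In the sketch below (no create_function() at all, variable names are just for the demo), the assignment inside the concatenation runs while the string is being built, whether or not anything ever executes that string:

```php
<?php
$flag = false;

// The assignment below runs while the string is being concatenated,
// long before anything could execute the string as code.
$code = 'if ($status[1]) { # ' . ($flag = true) . '
    return "x"; }';

var_dump($flag);  // true, although $code was never executed
echo $code, "\n"; // the assignment has collapsed to the text "1"
```

If that reading is right, $gz_activated in gz_cmp() is set on every call, whatever the value found in $gz-stats=x$.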

As you can imagine, it's impossible (at least I think) to pass custom parameters to the OB callback, so I found myself stuck. I'm now asking you: can you think of any solution to
- get the flag out of the lambda func ONLY when expected, OR
- get the GZ Availability value into the OB callback by any other way..

...knowing that I can't bear with globals, and that I'd already forget to reload the page after storing the value in any dead mem, if I were you ^^'.
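
One direction I have not tried for real yet, so take it as a rough sketch only: it assumes PHP 5.3 or later (closures with "use"), and gz_cmp_closure() is a made-up name. The flag is captured by reference, so it changes only when the callback actually sees a 1, and there is still a single regexp pass.

```php
<?php
// Sketch assuming PHP 5.3+ closures; gz_cmp_closure is a hypothetical name.
function gz_cmp_closure($buffer) {
    $gz_activated = false;
    $s_size = strlen($buffer);

    $buffer = preg_replace_callback(
        '/\$gz\-stats=(\d+)\$/',
        function ($status) use (&$gz_activated, $buffer, $s_size) {
            if (!$status[1]) {
                return '';           // compression disabled: just drop the marker
            }
            $gz_activated = true;    // set only when the marker says 1
            $c_size = strlen(gzencode($buffer, 9));
            return round(100 - $c_size / $s_size * 100, 1) . '%';
        },
        $buffer
    );

    // Only compress when the header can still be sent to the browser.
    if ($gz_activated && !headers_sent()) {
        header('Content-Encoding: gzip');
        $buffer = gzencode($buffer);
    }

    return $buffer;
}
```

It would be plugged in with ob_start('gz_cmp_closure'), with no globals involved.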

Is that some kind of challenge, or am I just blind? I think I got into something that may be above my level -.-'

Thanks for everything!

-thib´

PS: If you have a totally different solution, I wouldn't mind throwing all of this away; it always hurts, but I think I've gotten used to it. =P