Re: CL Scaling for High Traffic Web Sites



On Apr 29, 11:09 pm, bob <papersm...@xxxxxxxxx> wrote:

The idea of memcached is to make use of existing memory as cheaply
as possible, retrieving data from multiple machines through a
centralized interface, while offering orders of magnitude better
performance than disks.

My experience, based on having worked at a significantly large site,
was that the best approach by far was to use a customised traditional
web cache (a distributed squid in some form or other) to cache
everything that could "easily" be cached. This is typically more than
90% of your traffic. For the remaining traffic I think that non-stupid
design of the system would mean that a decent database can cope (in
fact I know this is true).

The front-end cache needs significant thought - we had a very smart
guy who did ours, basically replacing some expensive commercial thing,
which killed the site more-or-less every day, with a customised squid
which worked very well. Even then there were things it was critical
to get right - for instance we had an imagemap which caused URLs to
come in with click coordinates in them, so they were essentially
always a cache miss, and this used to kill the site frequently. Fixing
this made a huge difference.
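
The post does not say how the imagemap problem was fixed. One possible approach, sketched here purely as an illustration, is to derive the cache key from the image region a click falls in rather than from the raw coordinates, so that all clicks in one region share a single cache entry. The region table, the `x,y` query format and the function names are all assumptions, not details from the original site.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical regions of the image: (name, left, top, right, bottom).
REGIONS = [
    ("news", 0, 0, 199, 99),
    ("sport", 200, 0, 399, 99),
]

# Assumed query format for a server-side imagemap click: "x,y".
_COORDS = re.compile(r"(\d+),(\d+)")


def region_for(x, y):
    """Return the name of the region containing (x, y), or "default"."""
    for name, left, top, right, bottom in REGIONS:
        if left <= x <= right and top <= y <= bottom:
            return name
    return "default"


def cache_key(url):
    """Replace a coordinate query with its region; leave other URLs alone."""
    parts = urlsplit(url)
    match = _COORDS.fullmatch(parts.query)
    if match is None:
        return url
    region = region_for(int(match.group(1)), int(match.group(2)))
    return urlunsplit(parts._replace(query="region=" + region))
```

With this table, `http://example.com/map?10,20` and `http://example.com/map?15,80` both map to the same key, so only the first of them would have to go to the back end.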

The non-stupid design thing also matters. At some point I worked out
that we were averaging 1000 requests a second at the front end, well
over 90% of which were satisfied from cache, and the back end database
was sustaining 4000 IOPS. So that's fewer than 100 uncached requests a
second, and more than 40 I/O operations for each of them. I kind of
realised we were doomed at that point.
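
The arithmetic behind that estimate can be checked with a few lines. This is only a sketch: the post gives the hit rate as "well over 90%", so the two hit rates used below (90% and 99%) are assumed values chosen to bracket it, and the function name is made up for the example.

```python
def ios_per_uncached_request(requests_per_sec, hit_percent, backend_iops):
    """Average back-end I/O operations spent on each cache miss."""
    uncached_per_sec = requests_per_sec * (100 - hit_percent) / 100
    return backend_iops / uncached_per_sec


# 1000 requests/s and 4000 IOPS are the figures from the post.
print(ios_per_uncached_request(1000, 90, 4000))  # 40.0
print(ios_per_uncached_request(1000, 99, 4000))  # 400.0
```

The higher the real hit rate, the fewer uncached requests there are to share the same 4000 IOPS, and the worse the per-request figure gets.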

--tim
