Snipe.Net Geeky, sweary things.

Quick and Dirty PHP Caching

Caching your database-driven website pages has a plethora of benefits, not the least of which being improved speed and reduced server loads. This article will explain how to set up a simple caching system, and will also address when and where caching might not be appropriate.

For me, the impetus to switch to a caching method for one of my database-driven sites was sparked by Mosso, since they bill by CPU cycle, and I have one site that is, well, humongous (60k+ pages), and it happens to be the highest-traffic site on the account. While the database queries were all very efficient, and each page had, on average, no more than 6 queries, performance and CPU cycles would both be helped quite a lot by implementing a cache. This caching solution was a temporary fix while we switched to a new CMS that was already using a robust caching system. It’s quick, it’s dirty, but it got the job done for the interim.

We’ll walk through how to execute a simple PHP cache, and then I’m going to explain how doing so without a little forethought will screw you right in the ear. Note that this is called a Quick and Dirty solution for a reason. There are more complex, more efficient methods available, but this covers some basics.

Using output buffering, caching pages is incredibly easy. Simply put, output buffering allows you to control when output is sent from the script. This is particularly handy if you’re using cookies or sessions or some other process that sends headers to the browser before the page loads (as anyone who has gotten those pesky “headers already sent” errors can tell you.)
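To make that concrete, here is a minimal sketch of capturing output in a buffer instead of sending it straight to the browser. The helper name capture_output() is made up for this example; ob_start() and ob_get_clean() are standard PHP functions.

```php
<?php
// capture_output() is a hypothetical helper, not part of PHP itself.
// It runs $render and returns whatever it echoed, as a string.
function capture_output(callable $render): string
{
    ob_start();                     // start buffering; nothing is sent yet
    $render();                      // anything echoed here lands in the buffer
    return (string) ob_get_clean(); // hand back the buffer contents and discard the buffer
}

$html = capture_output(function () {
    echo '<h1>Hello</h1>';
});

// $html now holds the markup; it only reaches the browser when we echo it.
echo $html;
```

Once the page is sitting in a string like this, saving a copy of it to disk is the easy part.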

Please note that this article assumes your cache files will be created in a directory called ‘cache’ – and that this cache directory must be writable by the webserver.

Please also note: the syntax highlighter was made of fail for this article and was double, sometimes triple converting HTML entities. I have fixed it a dozen times, and then every time I edit the post, I have to fix it all over again. So if you notice any funky characters that don’t look like they belong in the script snippets, they probably don’t. Let me know and I’ll fix them, yet *again*.

The basic stuff

In all its 6-lines of glory, this is actual, working caching code.

[source=php]// TOP of your script
ob_start(); // start the output buffer
$cachefile = "cache/cachefile.html";
// Your normal PHP script and HTML content here
// BOTTOM of your script
$fp = fopen($cachefile, 'w'); // open the cache file for writing
fwrite($fp, ob_get_contents()); // save the contents of output buffer to the file
fclose($fp); // close the file
ob_end_flush(); // Send the output to the browser[/source]

There are, of course, two major flaws with just using the script above. First, we’re always writing to cachefile.html file, which would only be useful to you if your website was only one page. And second, notice that the script writes to the cache, but never actually retrieves the cache file – it’s still running through the whole script every time. But, this is just the beginning. That’s all there is to the actual caching part – the rest of this article will deal with the when/where of caching, but the how is that right there.

Which brings us to the next step… adding the ability to check whether or not a cache file exists, and use that instead of running through the normal script. We’re going to keep using the one-page website model for now, but I’ll get into creating cache files for different pages later.

Checking for a cache file

Creating the cache file from database-driven content is easy, as we’ve seen – but it’s only useful if we actually check whether a cache file exists and serve that instead of live database output. Using the modification below, we check to see if a cache file already exists and, if it does, include it and exit instead of running through the normal PHP script.

[source=php]// TOP of your script
ob_start(); // start the output buffer
$cachefile = 'cache/cachefile.html';
if (file_exists($cachefile)) {
    // the page has been cached from an earlier request
    include($cachefile); // include the cache file
    exit; // exit the script, so that the rest isn't executed
}[/source]

This is marginally more useful, since it actually prevents the script from executing if a cache file exists, however the way this is currently written, it will include that file for an indefinite time, never actually executing your full script again. Normally, in a cache situation, we want the ability to “expire” content after a certain time, so an updated version will be displayed and cached. You could automatically force a new page cache file to be generated by setting a cron job to automatically delete your cache files every hour/day/week/whatever – or you could handle this on the script level.

Setting cache urls

In our examples, we’ve been using cache/cachefile.html as the filename for the cache file that is generated. As I mentioned, this is great if your site is only one page, but otherwise every page this script is run on will create the same cache file, so you’ll end up serving the same cached file as content for every page on your site. Not awesome.

The easiest way to create individual cache files for each specific page is to do something like this:

[source=php]$cachefile = basename($_SERVER['SCRIPT_URI']);[/source]

This takes the unique url of the page requested and uses that as the cached filename.

But, there’s a gotcha. If your site uses pages that pass GET requests, such as a search page, etc – the SCRIPT_URI won’t see that as part of the url, so once someone does a search, all subsequent search requests will serve that same cached file unless you make the file name unique to each GET request.

In other words, if your search is located at yoursite.com/search.php, and when someone performs a search, the url looks something like yoursite.com/search.php?q=foo, PHP sees that url as search.php, regardless of the query string. So basically, it will break your search, big time.

NOTE: It may not be worth caching every GET request if your site doesn’t get a lot of traffic to files that use this. Or if disk space is a concern. Since there are a potentially unlimited number of GET strings that could be passed to your script (even bogus ones that don’t return valid results on your site), you may want to evaluate whether or not caching search pages is appropriate. In my case, it was – but it may not be for everyone. At the very least, if you opt to do this, make sure you’ve got some sanity checking in there so some asshole with a grudge can’t just sit there creating new, bogus query strings to eat up your disk space.
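One possible form of that sanity checking is to build the cache key only from parameters you actually expect, so junk parameters can't spawn new cache files. This is a rough sketch, assuming a search page that only understands q and page; those parameter names, the function name and the length limit are all made up for illustration.

```php
<?php
// Hypothetical helper: reduce the GET data to the parameters we know about.
// Returns a normalised query string to use as a cache key,
// or null when the request looks suspicious and should not be cached at all.
function cacheable_query(array $get, array $allowed, int $maxLength = 100): ?string
{
    $clean = [];
    foreach ($allowed as $name) {
        if (!isset($get[$name]) || $get[$name] === '') {
            continue; // parameter not supplied
        }
        if (!is_string($get[$name]) || strlen($get[$name]) > $maxLength) {
            return null; // arrays or oversized values: serve live, do not cache
        }
        $clean[$name] = $get[$name];
    }
    // Unknown parameters were never copied, so they cannot change the key
    return http_build_query($clean);
}

// ?q=foo&junk=123 reduces to "q=foo", and so does ?q=foo&junk=456
echo cacheable_query(['q' => 'foo', 'junk' => '123'], ['q', 'page']);
```

In a real page you would pass $_GET as the first argument and skip both reading and writing the cache whenever the function returns null.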

If you decide to cache query string data, you could do something like this:

[source=php]$cachefile = basename($_SERVER['SCRIPT_URI']);
if ($_SERVER['QUERY_STRING'] != '') {
    $cachefile .= '_' . base64_encode($_SERVER['QUERY_STRING']);
}[/source]

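This basically just grabs the file name, checks to see if there is any GET data passed, and if there is, it appends a base64-encoded string of the query that you can use as part of your cache file name. One caveat: standard base64 output can contain /, + and = characters, so it is not strictly safe for file names; hashing the query string with md5() instead avoids that.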
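Because base64 output can contain characters such as / that are awkward in file names, a hash is a common alternative. The sketch below is one way to do it, not the code this site actually ran; the function name and the .html suffix are assumptions.

```php
<?php
// Hypothetical helper: build a per-page cache file name.
// $scriptName is something like "/search.php"; $queryString is the raw GET string.
function cache_file_for(string $scriptName, string $queryString, string $dir = 'cache'): string
{
    $name = basename($scriptName);
    if ($queryString !== '') {
        // md5() yields 32 hex characters, which are always safe in a file name
        $name .= '_' . md5($queryString);
    }
    return $dir . '/' . $name . '.html';
}

echo cache_file_for('/search.php', 'q=foo'), "\n"; // cache/search.php_<32 hex chars>.html
echo cache_file_for('/about.php', ''), "\n";       // cache/about.php.html
```

On a live request you could pass $_SERVER['SCRIPT_NAME'] and $_SERVER['QUERY_STRING']. SCRIPT_NAME is used here because SCRIPT_URI is not provided by every server setup, and an empty value would leave you with a cache path that is just the directory.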

Setting an expiration

You have three basic options for expiring your cache:

  1. Set up a cron job to automatically delete all of your cache files at specified intervals
  2. Check the data source file for modification, and expire it if the source file is newer than the cache file
  3. Check the timestamp of the cache file and delete+regenerate if it is older than x

Cron Job: Setting up a cron job to delete your entire cache at specific intervals is arguably the easiest solution, but not really the most efficient, especially with very large websites. Rather than just deleting the page that’s been determined to be expired, you’re deleting (and then subsequently regenerating) a large number of files in one shot.

Data Source: Checking the data source file for modification is potentially the smartest way to handle caching, since it means the cache would never be expired if the data didn’t change. That certainly makes sense to do, since a page that hasn’t been updated doesn’t need to be regenerated, so you’re really getting the most bang for your caching buck.

The problem arises when you’re caching pages that are dynamically generated based on database records. The actual script that generates the data may not have been changed for quite some time, but the data records you’re fetching from the database may have been changed, so just checking the cache file date against the date the script was last modified will not give you what you need.

A workaround there would be that you could do a quick db query at the top of every page to find out when the record was last modified and compare that to the modification time on the cache file, but that means that every page, even your cached pages, will be performing a database hit on every page load. This may be perfectly acceptable to you, but it’s something to consider. Perhaps a better way of handling this would be to modify the content management system by which you publish content, so that the cache file is only deleted when you publish edits. This method would be the most thorough and efficient way, since your cache file would only be updated when you update something, and would be left to be served statically unless the data has changed. Although that’s outside the scope of this quick and dirty article, extending the code below to accommodate that wouldn’t take much work.
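As a rough illustration of that last idea, the publishing code could delete the matching cache file whenever an edit is saved. Everything named here (the function and the cache directory layout) is hypothetical; your CMS will have its own place where "content was saved" happens.

```php
<?php
// Hypothetical helper to call from the CMS right after an edit is saved.
// Returns true if a stale cache file was removed, false if there was none.
function expire_cached_page(string $pageName, string $dir = 'cache'): bool
{
    $cachefile = $dir . '/' . basename($pageName);
    if (is_file($cachefile)) {
        return unlink($cachefile); // the next request regenerates the page
    }
    return false;
}

// e.g. right after saving edits to article.php:
expire_cached_page('article.php');
```

With this in place the cached copy can live indefinitely, because it only goes away when the content behind it actually changes.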

Cache Timestamp: We’re going to address the third option, since it’s the most commonly used and would serve as the foundation for the second option anyway.

[source=php]// TOP of your script
$cachefile = 'cache/' . basename($_SERVER['SCRIPT_URI']);
$cachetime = 120 * 60; // 2 hours
// Serve from the cache if it is younger than $cachetime
if (file_exists($cachefile) && (time() - $cachetime < filemtime($cachefile))) {
    include($cachefile);
    // append an HTML comment noting when the file was cached (date format is illustrative)
    echo "<!-- Cached " . date('jS F Y H:i', filemtime($cachefile)) . " -->";
    exit;
}
ob_start(); // start the output buffer[/source]

This script gets the file name, sets a cache time, checks to see if the cache file exists, and if it does, it checks if the cache file is younger than the cachetime. If the cache is still valid, it includes the file and exits the script. If not, it will continue on to execute the script and create a new cache file. It also tacks on a comment at the very end of the cached output that tells you when the file was cached. This can be helpful in debugging, and helps you verify that the page you’re seeing is in fact a cached version, not a live version. (You can see this in action if you view the source of this page and look down at the very bottom of the source code.)
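If the freshness test reads a bit backwards, it may help to pull it out into a function of its own. This is only a sketch, and the function name is hypothetical; the comparison is the same one as in the script, rearranged as "age is less than the limit".

```php
<?php
// Hypothetical helper: true if $cachefile exists and is younger than $maxAge seconds.
function cache_is_fresh(string $cachefile, int $maxAge): bool
{
    if (!is_file($cachefile)) {
        return false;
    }
    $age = time() - filemtime($cachefile); // seconds since the cache was written
    return $age < $maxAge;
}

// Small demonstration using a throwaway file in the system temp directory
$cachefile = sys_get_temp_dir() . '/demo_cache.html';
file_put_contents($cachefile, '<p>cached</p>');
var_dump(cache_is_fresh($cachefile, 120 * 60)); // just written, so fresh

touch($cachefile, time() - 3 * 60 * 60);        // pretend it is 3 hours old
clearstatcache();                               // PHP caches filemtime() results
var_dump(cache_is_fresh($cachefile, 120 * 60)); // now stale
unlink($cachefile);
```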

The script, the whole script and nothing but the script

Put all together, this is what our caching script looks like:

[source=php]// TOP of your script
$cachefile = 'cache/' . basename($_SERVER['SCRIPT_URI']);
$cachetime = 120 * 60; // 2 hours
// Serve from the cache if it is younger than $cachetime
if (file_exists($cachefile) && (time() - $cachetime < filemtime($cachefile))) {
    include($cachefile);
    // append an HTML comment noting when the file was cached (date format is illustrative)
    echo "<!-- Cached " . date('jS F Y H:i', filemtime($cachefile)) . " -->";
    exit;
}
ob_start(); // start the output buffer
// Your normal PHP script and HTML content here
// BOTTOM of your script
$fp = fopen($cachefile, 'w'); // open the cache file for writing
fwrite($fp, ob_get_contents()); // save the contents of output buffer to the file
fclose($fp); // close the file
ob_end_flush(); // Send the output to the browser[/source]
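The same steps can also be wrapped in a single function, which avoids the exit and so works for a chunk of a page as well as a whole one. This is a sketch rather than the script this site used: the function name is hypothetical, and it swaps fopen/fwrite for file_put_contents() with LOCK_EX and returns the HTML instead of including it.

```php
<?php
// Hypothetical wrapper around the steps above. $render is whatever produces the page.
function cached_page(string $cachefile, int $cachetime, callable $render): string
{
    // Serve from the cache if it is younger than $cachetime
    if (is_file($cachefile) && (time() - $cachetime < filemtime($cachefile))) {
        return (string) file_get_contents($cachefile);
    }
    ob_start();                      // start the output buffer
    $render();                       // run the normal, expensive page code
    $html = (string) ob_get_clean(); // grab the output and close the buffer
    // LOCK_EX stops two simultaneous requests from interleaving their writes
    file_put_contents($cachefile, $html, LOCK_EX);
    return $html;
}

// Demonstration with a throwaway file in the system temp directory
$file = sys_get_temp_dir() . '/cached_page_demo.html';
@unlink($file);
$render = function () {
    echo '<p>Generated page</p>';
};
echo cached_page($file, 120 * 60, $render); // runs $render and writes the cache
echo cached_page($file, 120 * 60, $render); // served from the cache file
unlink($file);
```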

Gee… Oh… Cache challenges

I know. Going to hell for that awful joke. Moving on…

Caching is a great way to speed things up on dynamic sites and save on server resources – however if your site has any kind of more advanced features, you need to be selective about where you apply it. The cache is not smart, so you have to be. Ideally, you’ll be building your caching system into the site as you develop the site and the content administration system – but if you end up having to add caching later, you really have to think everything through.

Examples of things that WILL break if you use caching unless you specifically work around them:

User login: “Welcome, user” logged-in functionality (the first user who logs in will create the cache, and everyone else logging in will see the first user’s name instead of their own!)

Voting: If you have any kind of voting functionality built into your pages, new votes will not be captured and old ratings will be displayed.

Anything requiring a POST request: Same as above. The first person submitting the form will get correct results, but anyone submitting it after them will get the first user’s cached results.

Geo-IP lookup: If you’re displaying geographically relevant information to the user based on their IP address, the same rules apply. The first user hitting your site will create the cache file and everyone else accessing it will see the first user’s geographic results instead of their own.

And so on…

That said, all hope is not lost. Depending on the situation and what functionality I’m trying to preserve, I usually handle this one of three ways:

Only serve cached files to users who are NOT logged in. This takes care of a lot of the issues right there – if a user has a profile preferences page, email preferences page, or whatever – all of these will be cached by the first user accessing them. The easy way around this is simply to serve live data to the user if they are logged in, cached pages if they are not. This will reduce the effectiveness of your caching system to some degree, but many users never bother logging in, so you’re still getting significant savings. (If 90% of your site’s content is only available to logged-in users, you may need to rethink your caching system though.)

Use AJAX. This is one of the few situations where AJAX really can be 100% appropriate. Since AJAX requests are asynchronous and are not served from this page cache, this is a great solution for your voting script situations. Mind you, you should make sure your solution degrades gracefully for users who have JavaScript turned off.

Only cache parts of your page instead of the whole thing. With a little more work, you can set up your caching system to only cache parts of your page, and not the entire page. This may reduce the effectiveness of the caching system, but may be necessary depending on your situation.
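The first approach boils down to a yes/no check made before the cache is read or written. Here is a minimal sketch of such a check that also skips POST requests; the function name and the user_id session key are assumptions, so substitute whatever your login code actually sets.

```php
<?php
// Hypothetical guard: may this request be served from (and saved to) the cache?
// $method is the HTTP request method; $session is the session data.
function can_use_cache(string $method, array $session): bool
{
    if (strtoupper($method) !== 'GET') {
        return false; // POST results belong to the user who submitted the form
    }
    if (!empty($session['user_id'])) {
        return false; // logged-in users get live, personalised pages
    }
    return true;
}

var_dump(can_use_cache('GET', []));                // anonymous visitor: bool(true)
var_dump(can_use_cache('GET', ['user_id' => 42])); // logged in: bool(false)
var_dump(can_use_cache('POST', []));               // form submission: bool(false)
```

On a real page you would pass $_SERVER['REQUEST_METHOD'] and $_SESSION, and wrap both the "serve from cache" block at the top and the "write cache" block at the bottom in this check. Otherwise a logged-in user's page can still end up saved as the cached copy.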

One final gotcha

You should consider a graceful way of handling database failures as well. Say you have your cache time set for 3 days – a long time by some standards, but not at all unreasonable if you have disk space to spare and your content doesn’t update that often. If your database throws an error when your cache file is being regenerated, that error will continue to be displayed for 3 days, even if the database error has been corrected. You should consider how to handle that gracefully, even if it’s a cheap and dirty method. For example, you could set up a website monitoring service that notifies you when your content has changed. If your page isn’t loading properly, you’ll be notified by text or email, and that will give you the opportunity to fix the error and manually blow out your cache so it can regenerate.
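Another cheap option is simply not to save the page when something went wrong while building it. This is a sketch under the assumption that your script keeps some flag recording whether its queries succeeded; the function name and the $ok flag are hypothetical.

```php
<?php
// Hypothetical helper: write the cache only when the page was built successfully.
// $ok is whatever your script uses to record "all queries succeeded".
function save_cache_if_ok(string $cachefile, string $html, bool $ok): bool
{
    if (!$ok || trim($html) === '') {
        return false; // leave any existing cache file alone rather than caching an error
    }
    return file_put_contents($cachefile, $html, LOCK_EX) !== false;
}

// Demonstration with a throwaway file in the system temp directory
$file = sys_get_temp_dir() . '/save_cache_demo.html';
file_put_contents($file, '<p>good page</p>');
save_cache_if_ok($file, '<p>Database error</p>', false); // failure: nothing is written
echo file_get_contents($file);                           // still the good page
unlink($file);
```

The visitor who triggered the failure still sees the error, but it is not frozen into the cache, so the next request gets another chance to build the page properly.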

A note for WordPress users

If you’re using WordPress and are looking for a way to reduce server load and speed your blog up, you’re in luck. WP-Super Cache is an unparalleled caching solution for WordPress that is basically plug-and-play, no coding required.

Caching Libraries

Thanks to the fabulous comments to this article (and I genuinely do mean that), I am reminded to remind you that this method is exactly what it says it is – quick and dirty – and it makes NO attempt to be the best solution to your caching needs. It is as much an exercise in considering where caching is appropriate (and inappropriate) as it is anything else.

For more sophisticated (and certainly more elegant) solutions, check out PEAR’s Cache_Lite, XCache (lighttpd), eAccelerator and Zend_Cache, and read up on APC and memcached.

About the author

snipe

I’m a tech geek/dev/infosec-nerd/scuba diver/blacksmith/sword-fighter/crime fighter/ENTP/warcrafter/activist. I run Grokability, Inc, and run several open source projects, including Snipe-IT Asset Management. Tweet at me @snipeyhead.

  • You might also consider using APC – it allows for an automatic TTL argument (optional third parameter).

    When I cache with APC I use an MD5 of the $_SERVER[‘REQUEST_URI’] variable along with other things (for example, I might take user information to cache for specific users if I’m serving cached files to users).

    It’s worth noting that APC is about 50% faster than reading a file on every page.

  • Great point, Brandon! A smidge out of scope for the quick and dirty version, but great to bring up. (Since it’s not bundled with PHP, I didn’t want to get into determining if they have it installed, etc.) Would be perfect for a future followup on a more advanced/less dirty caching system 🙂

    Info on APC is available here:
    http://us2.php.net/apc

    Note that it’s a PECL extension that may or may not be installed on your server.

  • Ah, fair point. I expect every developer worth his or her salt knows about and uses APC or some similar caching engine. And as of PHP 6 APC will be compiled in by default. I always compile PHP with APC.

    Brandon Savage’s last blog post..What Matters Most (Job Hunt Advice)

  • Right, but some devs don’t have that level of control over their servers. 🙂

  • toto

    It is important to pay attention to disk accesses! Because instead of (maybe) having 1 single php page connected to a database (hopefully in RAM), you will have a single page with lots of disk accesses (fopen).

    It may be important to generate the caches in a folder in memory:
    http://kevin.vanzonneveld.net/techblog/article/create_turbocharged_storage_using_tmpfs/

  • toto – thanks for the comment! Unfortunately, not everyone has access to /dev/shm on their machines.

  • Incidentally though, you’d only have a single page with lots of fopens if a lot of people hit the exact same page at the exact same time – and this is not meant to be a high-performance caching tutorial for extremely high traffic sites. That’s why it’s called “quick and dirty” 🙂 The whole point of checking the filemtime of the cached file is to prevent excessive fopens and fwrites if they are not needed.

  • Or, instead of using home grown code, use an established caching library to take care of these details for you. Solutions such as PEAR’s Cache_Lite and Zend_Cache give you the ability to manually expire a cache or sets of caches by tag, solving the “updating content” problem; PEAR’s Cache and Zend_Cache also allow you to cache to a variety of backends (such as APC and memcached), which allows you to test caching locally using one backend but scale out in production using another. These solutions all allow you to also cache native PHP variable types — allowing you to separate the caching from the display (display logic is usually _not_ the bottleneck, while accessing your models _is_).

  • Hi Matthew – absolutely, there are lots of great libraries out there, with clear advantages over the quick and dirty solution – a good idea for part two of this article. I never meant to imply that this is the best way of doing things – just that it’s *a* way, hence the title 🙂

    I for one do not have access to memcached or APC on Mosso (although I’ve heard rumors that they’re considering setting up a memcached server – totally unsubstantiated at this point, but I am hopeful).

  • toto

    Hi Snipe! You are right, unfortunately /dev/shm is not accessible to most of the hosting plans. 🙁

    @Matthew: Pear packages are a big mess. But Cache_lite is the solution that we have selected, because it is independent from the heavy pear class. We only had to do one easy modification, so that the error management does not call any other pear class.

    Zend classes are well written, PHP5, but….. the invocation graph is a nightmare…
    Zend cache class has a tagging method which is the best solution (compared to Pear) to manage the cache life cycle.

    BTW, thanks for your article which was really interesting. 🙂

  • @toto Zend_Cache can be used as a standalone component — you can cherry-pick it from the distribution or via the CodeUtopia packagizer. As for the call graph — well, look at the call graphs of most frameworks; besides, it’s irrelevant here as you’d only be using a single component, not the entire MVC. (Just as an historical note, the Cache_Lite developers contributed Zend_Cache, and it is the evolution of Cache_Lite.)

  • Oh, and Snipe — it _is_ a good article. 🙂 I just wanted to point out that there are existing solutions that overcome many of the problems you’ve outlined.

  • Hi Matthew – thanks 🙂 My particular situation was not ideal. The CMS driving the site in question has no provisions for caching, and the site wasn’t designed with caching in mind (it’s a few years old now, and we’ve outgrown it a bit.) There was a sense of urgency involved, partially because the cpu cycles we were eating through were driving my hosting bill through the roof – and partially because I hadn’t set up our dev environment yet, so I didn’t have a place I could try implementing the libraries without potentially breaking the live site. I needed something fast, just to see how much of an impact caching would have, and whether it would be worth using a more sophisticated method, or moving it off the cpu-cycle-billing environment. The quick and dirty way let me throw up a cache really quickly without breaking anything, so we could start to get some benchmarks up. I’ll be moving the site to a new CMS (EE or SS most likely, both of which have caching capabilities built in), but first and foremost I needed to find out if we’d still be screwed even if we cached everything.

    The audience of my blog tends to be less programmer, more developer, if you know what I mean. Many of my readers have never implemented any kind of cache before, or even considered doing it. Not that I would want to teach them poor practices (via quick and dirty) out of the gate, but the discussion – the logical walk-through of what to keep in mind – tends to be the tack I take on my posts. I won’t have the perfect solution for everyone’s individual situation (and I never try to claim I do), but I try to help show people where there may be something they might overlook, by using simple examples. Getting people to think things all the way through is what I try to accomplish here – that’s all this was 🙂

  • Jackos

    Great article. Interesting and practical approach. I’m especially grateful that you pointed out potentially tricky situations. (still, found a lot of misspellings ;p)

  • LOL yeah, so did I – but every time I try to edit the page, the freaking syntax highlighter plugin keeps converting all my && and <> signs into HTML entities x6, so I end up with if (($blah) &ampampampampampampampamp ($blergh) {. After the 11th time of fixing them, I took a break for the evening. Will hit it again today, but was getting frustrated.

  • Thanks for the article! As someone who is getting more into PHP programming both in and out of frameworks, it’s good to see how some of the caching systems of those frameworks work. There are always times where a small program of my own that doesn’t need an elaborate backbone to be functional may have a good use for simple caching, in which case I now know right where to begin. 🙂

    Keep up the writing!

    Eric Roberts’s last blog post..So just what are we up to?

  • Suraj kaushik

    hello, dear
    hee,,,,,,,,,heee,,,,,,,,,,.
    sorry, but below response is coming from your script.
    please tell me how can i fix this.
    i have about 8 links on the list-view.php, and all of them carry data from mysql.
    i want to create cache file for each link in cache folder………..
    any solutions..

    Warning: include(cache/) [function.include]: failed to open stream: No such file or directory in F:\wamp\www\newproject\list-view.php on line 7

    Warning: include() [function.include]: Failed opening 'cache/' for inclusion (include_path='.;C:\php5\pear') in F:\wamp\www\newproject\list-view.php on line 7

  • Looks like a few things are happening here – first is that your cachefile name isn’t being passed to the script – notice it’s trying to include the directory ‘cache/’ instead of ‘cache/somefile.html’.

    Also, you’re obviously on a Windows machine, so double-check your file paths. Windows natively uses ‘\’ where Linux uses ‘/’, although PHP on Windows generally accepts ‘/’ as well, so the empty cache file name is the more likely culprit.
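
A minimal sketch of a guard against the empty file name seen in the warnings above. The helper names `cache_path_for()` and `serve_cached()` and the `.html` suffix are assumptions for illustration, not part of the original script. It uses readfile() rather than include() so the cached file is printed as-is.

```php
<?php
// Hypothetical helper: returns the cache file path for a page name,
// or null when no usable name was supplied.
function cache_path_for($page, $cacheDir = 'cache')
{
    // basename() strips directory parts so the name cannot escape $cacheDir.
    $name = basename((string) $page);
    if ($name === '' || $name === '.' || $name === '..') {
        return null; // would otherwise resolve to the bare directory "cache/"
    }
    return $cacheDir . '/' . $name . '.html';
}

// Hypothetical helper: prints the cached copy if there is one.
// Returns false so the caller knows it has to build the page instead.
function serve_cached($page, $cacheDir = 'cache')
{
    $file = cache_path_for($page, $cacheDir);
    if ($file !== null && is_file($file)) {
        readfile($file);
        return true;
    }
    return false;
}
```

With eight links on one script, each link would need its own name, for example one derived from the query string, so that every link gets a separate file in the cache folder.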

  • Jeff

    Best explanation and simple example of caching I’ve seen. Many comments miss the point, yes there are many packages that can do caching, but if you don’t understand how it works and implement a simple one…you’re asking for trouble in my experience.

  • Steve

    This was great. I have one query from one server to a db out across the network that takes about 6 seconds. The content is only updated once or twice a week so this was perfect. I used this for just a small chunk of the page content and the “exit” in the script was killing the rest of the page processing. I switched it to an else statement and it was good to go.
    THANKS!
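
The if/else approach described above for caching one chunk of a page might look roughly like this. It is a sketch only: `cached_fragment()` is a hypothetical helper, and the file name and maximum age are whatever suits the content.

```php
<?php
// Hypothetical helper: prints a cached fragment, rebuilding it when the
// cache file is missing or older than $maxAge seconds.
function cached_fragment($cacheFile, $maxAge, callable $build)
{
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
        readfile($cacheFile);   // fresh copy: print it and carry on
    } else {
        ob_start();             // capture only this fragment
        $build();
        $html = ob_get_clean();
        file_put_contents($cacheFile, $html, LOCK_EX);
        echo $html;
    }
    // No exit here, so the rest of the page keeps rendering.
}
```

The slow query would go inside the callback, for example `cached_fragment('cache/report.html', 86400, function () { /* run the query and echo the rows */ });`, where the path and the one-day age are placeholders.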

  • Jon

    Hi snipe. Great article, just implemented it on one section of my site. There's a problem though. My site uses user generated content. If a user creates a page, or comment or anything with the string <?php phpinfo() ?> for instance it is first served up as the plaintext string. All ok. But on reloading the page the include() function parses that text as a command and BOOM. Is there an alternative to include that will simply dump everything in the cachefile out to the browser, rather than processing it? Poo'd my pants when I realised.

  • Well, you could use something like file_get_contents(), but really, if you're storing user-created data in a database (or anywhere) it would be smarter and more secure to make sure you're escaping the content using something like htmlspecialchars() or htmlentities():

    http://php.net/manual/en/function.htmlspecialch
    http://www.php.net/manual/en/function.htmlentit

    That will convert the <?php into <?php, which is not parsable.

  • Err… looks like Disqus converted my HTML code.. lol.. Anyway, it would turn the <?php into the HTML equivalent of <, and then the ?php, which PHP won't see as parsable PHP code.
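
As a small illustration of the escaping suggested above, here is a sketch with a hypothetical wrapper `escape_for_html()`. The `ENT_QUOTES` flag and the UTF-8 charset are assumptions about the site.

```php
<?php
// Hypothetical helper: escapes user-supplied text before it is printed,
// and therefore before it can end up inside a cache file.
function escape_for_html($text)
{
    // Converts <, >, &, and both quote styles to HTML entities.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}
```

Passing the string from Jon's comment through this helper replaces the angle brackets with `&lt;` and `&gt;`, so the cached file no longer contains an opening PHP tag.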

  • Jon

    Me again, follow up. I should have really Googled before commenting because I've got the answer. Use readfile() in place of include(). This outputs whatever is in the file as-is, no processing is done on the content (which I'm guessing is also going to be marginally quicker).

    Using readfile() instead of include() also prevents the parsing problems that I had, where certain unexpected ASCII characters were causing the error:

    Warning: Unexpected character in input: '' (ASCII=16) state=1 in /some/directory/yoursite/www.example.com/cache/cachefile.html on line 255

    readfile() sidesteps this problem and lets the browser deal with the character, not PHP.

    Sorry if I went on a bit there. In a bit, Jon.
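
A minimal sketch of the readfile() swap described above. `send_cache_file()` is a hypothetical name; the point is only that the file's bytes go straight to the output and never reach the PHP parser.

```php
<?php
// Hypothetical helper: sends a cache file to the browser byte for byte.
// Unlike include(), readfile() does not execute anything in the file.
function send_cache_file($cacheFile)
{
    if (!is_file($cacheFile)) {
        return false; // nothing cached yet
    }
    return readfile($cacheFile) !== false;
}
```

In the article's script this would stand in for the `include($cachefile);` line, with the `exit` left where it is.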

  • That's definitely an option, but you should still consider escaping HTML and other potentially harmful scripting before outputting it to the browser. If you're storing this in a db, you might also consider using mysql_real_escape_string():
    http://php.net/manual/en/function.mysql-real-es

  • Nice tutor!!

  • I took this idea one step further and made it into a re-usable script that can be included at the top of any .php to cache the output of that page: http://github.com/tlrobinson/cacheme.php

  • Really nice article, I've written a very similar script this morning.
    I am the developer of a news website that receives more than 100k visitors a day, and for such websites caching is a must but has its downside.
    The best solution for such websites is to use a tight expiry time, no more than 2 minutes.
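
The tight expiry described above could be checked along these lines. This is a sketch: `cache_is_fresh()` is a hypothetical helper and the 120-second default simply mirrors the two-minute figure in the comment.

```php
<?php
// Hypothetical helper: true when the cache file exists and is younger
// than $maxAge seconds.
function cache_is_fresh($cacheFile, $maxAge = 120)
{
    // Drop PHP's cached stat data so filemtime() reflects the file on disk.
    clearstatcache(true, $cacheFile);
    return is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge;
}
```

The page would serve the cached copy only while `cache_is_fresh($cachefile)` holds, and regenerate it otherwise.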

  • mdelannoy

    a little enhancement for those interested:
    <?php
    // beginning of page
    // start timing the page load
    $time_start = microtime(true);

    // use a separate cache directory for development and production
    if ($_SERVER['SERVER_NAME'] == "localhost")
        $cachedir = "development_server_cache";
    else
        $cachedir = "production_server_cache";

    $cachefile = $cachedir . $_SERVER['PHP_SELF'];

    // serve the cache only if it exists, is newer than the script itself,
    // and the request does not carry ?nocache
    if (
        file_exists($cachefile)
        &&
        (filemtime($_SERVER['DOCUMENT_ROOT'] . $_SERVER['PHP_SELF']) < filemtime($cachefile))
        &&
        !isset($_GET["nocache"])
    ) {
        include($cachefile);
        echo "<!-- Cached ".date('jS F Y H:i', filemtime($cachefile))." -->\n";
        $time_end = microtime(true);
        $time = $time_end - $time_start;
        echo "<!-- took me $time seconds to load page -->\n";
        exit;
    }
    ob_start(); // start the output buffer
    ?>
    <?php
    // end of cache (this part goes at the bottom of the page)
    $time_end = microtime(true);
    $time = $time_end - $time_start;
    echo "<!-- took me $time seconds to serve uncached page -->\n";

    // create the cache directory tree if it does not exist yet
    if (!file_exists(dirname($cachefile)))
        mkdir(dirname($cachefile), 0777, TRUE);

    $fp = fopen($cachefile, 'w'); // open the cache file for writing
    fwrite($fp, ob_get_contents()); // save the contents of output buffer to the file
    fclose($fp); // close the file
    ob_end_flush(); // Send the output to the browser
    ?>

  • xcoder

    I built a class >> http://www.copypastecode.com/20949/

    All you need to do is include it and call cache::load() at the top of the page and cache::create() at the bottom =)

  • Aussiedude

    Just in case it helps anyone, I’ve been doing my homework on this for a while now, so I can create an effective template/cache system for the site I’m writing right now (in PHP from scratch, it’s really more of a personal learning experience, teaching myself PHP while making a site at the same time).

    I’ve designed my template system to build the page from small templated components that are combined together to form the whole page, such as one template each for how to display dialog messages, comments, profiles, menus, etc.

    That way, if the page being requested was requested before, and the output is exactly the same for both, it’ll use a cached version of the entire page.

    However, if not, it’ll use cached versions of certain parts of the page, and for the rest it’ll just make it up as it’s needed.

    (However the downside to this is, for whole page caching situations, it’ll still need to build the little components first before it builds the entire thing, but at least it’ll be making the little components from the cache. Optimally, it’d be best if it could go just straight to the full page cache, but it’s better than nothing for now until I can find a solution for that.)

    My templates just use PHP and the output buffer, I didn’t write my own tag system or anything, so all my templates are just php files in a template folder I load using my website’s main “engine” (if it could be called that?). The variables for the template are just passed through in a single associative array.

    Which makes it easy to tell if a cached version of the template can be used or not. I’ve got a folder called ‘cache’, where versions of each template are stored, with the name of the php file, followed by a short hash of the array’s contents, and unix timestamp.

    Example of a cache file for a template called ‘poll-view.php’:
    poll-view.d81cfdaaa2.1281634990.php

    (Format: file-name.10-character-hash.unix-time-stamp.php)

    If the same array values are passed to the same template, the output will be the same every time. Hence, if two hashes of the array are the same then it’s highly likely the two hashed arrays have the same contents. (As for hashing the array, I just implode it, use md5, then grab the first 10 characters).

    As for how long to keep it, I also included, as you can see, a unix time stamp in the cache file’s name. When a new cache file needs to be added, a check is done to see if the folder has too much in it; if so, the oldest cache files are removed until there is enough space.

    Though I haven’t finished that part of the code. Not sure if I’ll make it remove a template when the folder is above a certain number of MB’s, or just based on file count. MB’s would be better I’m guessing, but it depends on how easy it is to do that over counting the number of files in the folder.

    Even still, that does mean for each template component, it’s going to do a fair bit of work to add that cache file. So I’m thinking that might be something for a cron job to do..

    The only real benefit I see myself getting from it all is if I store a lot of cached templates, and if most of the requests use existing cached versions of the templates. If not, I think it might actually make my website slower. But I’m designing it to have an on/off feature, so I can experiment, see if it speeds up the website any, and if so, I’ll use it.

    So, if you’re currently making a website that needs caching right now, hope that helps! :3

    (Sorry for the length of the post and the typos)
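
The naming scheme described in this comment (file-name.10-character-hash.unix-time-stamp.php) can be sketched as follows. `template_cache_name()` is a hypothetical helper, and the `|` separator passed to implode() is an assumption, since the comment does not say which separator was used.

```php
<?php
// Hypothetical helper following the naming scheme described above:
// file-name.10-character-hash.unix-time-stamp.php
function template_cache_name($template, array $vars, $timestamp = null)
{
    // implode() + md5(), truncated to 10 characters, as in the comment.
    // Assumes a flat array of scalar values; nested arrays would need
    // something like serialize() instead of implode().
    $hash = substr(md5(implode('|', $vars)), 0, 10);
    $base = basename($template, '.php');
    $time = ($timestamp === null) ? time() : $timestamp;
    return $base . '.' . $hash . '.' . $time . '.php';
}
```

One caveat with this design: implode() ignores array keys, so two arrays with the same values under different keys produce the same hash. Hashing `serialize($vars)` instead would avoid that.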

  • Caching and reading from PHP is not bad, but using more advanced methods and going around PHP can bring you much better performance. I did a benchmark on my blog where I compared reading the cache from PHP and reading the cache from Apache; you can read more about it here:
    http://sven.webiny.com/advanced-cache-mechanism-using-php-cpp-and-apache/

  • Anonymous

    Wow, you were inspired a little too much from the original article it seems;

    http://www.theukwebdesigncompany.com/articles/php-caching.php

    • Dear “stop@stealing.com” – I’ve never seen that article before, and the two are nothing like each other. Do they cover the same things? Absolutely – it’s a freaking caching article – output buffering is sort of going to come up, and it’s going to be handled the same way every time. All ob caching articles are going to cover similar things because it’s done the same way every time. Duh.

  • Jeffrey Marx

    Hey Snipe, you said that it’s a good idea to cache parts of a php page. This is something I need to do, although I can’t seem to get it to work. Every time I try to put content below the fopen-to-ob_end_flush portion, it only shows up when I delete the cached file. It shows up once, and then disappears until I delete the cache again. So in a nutshell, I can put code above all the cached stuff, but not below it.
    Would you happen to have any thoughts on this? I would imagine it has something to do with the output buffering? If you or anyone else could point me in the right direction for a solution, i would really appreciate it. Thanks!

  • Awesome. I’ll take the code 🙂

  • Youwontgetmyemail

    Hey, tried it – really cool!
    I have an addition to your list of caching libraries: Zoocache is very easy to integrate into an application and you can choose a method to store your contents in. Also nice: It provides a blacklist using RegExp and you can write your own function to generate the storage key. This makes it possible to deliver different versions of the page for e.g. different webbrowsers! Hosted at: http://gihub.com/marcelklehr/zoocache.php

    • Great, I’ll definitely check that out, thanks!

      Side note, I really hate when people use fake email addresses to leave comments. If you irrationally think I’m going to do something bad with your email address, you shouldn’t bother posting here, since I’m obviously a piece of shit that doesn’t deserve your time – or use one of the oauth options like twitter, so I never even see your email. I find using a fake one pointless and disrespectful. And more frustrating, you won’t even have the option of being notified of this reply.

  • Great tutorial, I’ve been trying to implement a nice simple cache system server side. Seems you’re the only developer that bothers to explain how to implement it.

  • Jordan Moore

    Does my job perfectly. Had an extremely simple script which pulled some (potentially thousands) of entries out of a database at a time, but the database wasn’t frequently updated. With a bit of tweaking I’ll be able to serve a cache 100% of the time and only update the cache when the database is updated – thanks!
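
One way to update the cache only when the database changes, as this comment describes, is to delete the cached file from whatever code performs the update. This is a sketch under that assumption; `invalidate_cache()` is a hypothetical helper.

```php
<?php
// Hypothetical helper: removes a cached page so the next request rebuilds it.
// Returns true when no cached copy remains afterwards.
function invalidate_cache($cacheFile)
{
    if (!is_file($cacheFile)) {
        return true; // nothing cached, nothing to do
    }
    return unlink($cacheFile);
}
```

Calling it right after a successful INSERT or UPDATE means pages are served from the cache until the data next changes.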

  • Hello there!

    Thanks for making this idea so accessible.   For some reason, even though I knew this could be done, the application of it for simple caching never even popped into my head!   Simple.  Obvious.  Easy.

    The quoting in your code example has been mangled (you asked us to mention if that happened again) but other than that, it all went well.

    I used this as a starting point and then built automatic directory structure creation into it.   No fun having 600,000+ files in one directory.

    Ended up using readfile() instead of include, and noticed a very sizable performance gain, at least in my setup.  Your concern about escaping the file could be mitigated by the fact that everything that is being output into the file should already be sanitized when written.   Upon re-insertion of that same pre-sanitized data, everything should be ok.

    I also applied this to cache only certain parts of a script’s output.  This way the heavy lifting portion is cached on some pages that have a lot of per-user customization,  without having to cache a different version for every user.

    Thanks again for pointing me in the right direction.
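
The comment does not say how the directory structure was generated. One common approach, sketched here purely as an assumption, is to shard cache files into subdirectories named after the first characters of a hash of the cache key. `sharded_cache_path()` is a hypothetical helper.

```php
<?php
// Hypothetical helper: maps a cache key to a nested path of the form
// <cacheDir>/<2 hex chars>/<2 hex chars>/<md5 of key>.html,
// creating the directories on demand.
function sharded_cache_path($key, $cacheDir = 'cache')
{
    $hash = md5($key);
    $dir  = $cacheDir . '/' . substr($hash, 0, 2) . '/' . substr($hash, 2, 2);
    // The second is_dir() check covers another request creating the
    // directory between our first check and mkdir().
    if (!is_dir($dir) && !mkdir($dir, 0755, true) && !is_dir($dir)) {
        return null; // could not create the directory
    }
    return $dir . '/' . $hash . '.html';
}
```

Two levels of two hex characters give up to 65,536 directories, so 600,000 files would average fewer than ten per directory.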

  • jeev

    Been looking over the entire internet for a clear explanation like this one! Kudos!

  • João Verissimo Ribeiro
  • kingshuk

    how can i remove header from being cached ??

By snipe

About Me

I’m a tech geek/dev/infosec-nerd/scuba diver/blacksmith/sword-fighter/crime fighter/ENTP/warcrafter/activist. I run Grokability, Inc, and run several open source projects, including Snipe-IT Asset Management. Tweet at me @snipeyhead or read more...

Get in Touch