2010
I recently moved my company’s web infrastructure out of our colo facility and onto Amazon Web Services (mainly EC2). Hopefully I’ll have a writeup on the whole process soon. We then started running a couple of heavy ad campaigns, which put a severe load on the servers. At the time I had our Memcached setup disabled while working the kinks out of the migration.
With the large influx of traffic, it was time to set up Memcached again. This gave me a chance to re-examine how it was done. Previously, we would cache various blocks of a given page. Depending on the situation, this can be a good solution if you need a combination of cached content and dynamic output. However, this method is a little dirty since you have to modify your existing source code.
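For comparison, here is a rough sketch of what that block-level approach can look like. The helper name `cachedBlock`, the key, and the TTL are made up for illustration, and `$cache` is assumed to be an already connected `Memcache` instance (or anything with the same `get()`/`set()` methods). It is not the code we actually ran.

```php
<?php
// Hypothetical helper: return one cached block of a page, rendering and
// storing it on a cache miss. $cache is assumed to behave like a connected
// Memcache object (get() returns false on a miss).
function cachedBlock($cache, $key, $ttl, callable $render)
{
    $html = $cache->get($key);
    if ($html !== false) {
        return $html;                    // cache hit: reuse the stored block
    }
    $html = $render();                   // cache miss: build the block
    $cache->set($key, $html, 0, $ttl);   // keep it for $ttl seconds
    return $html;
}
```

Every block you want cached has to be wrapped in a call like `cachedBlock($memcache, 'sidebar', 180, $renderSidebar)`, which is exactly the kind of change to existing source code I wanted to avoid.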
Here’s a solution that I’ve come up with. Basically, you scrape the entire HTML output of the page and store that in Memcached. I think it’s very clean and non-invasive. You don’t have to modify the existing code at all.
```php
<?php
////////////////////////////////////////////////////////////////
// index.php
//
// This is an example of memcaching a full static page
////////////////////////////////////////////////////////////////

// This is the page you want to cache
$url = "http://www.mysite/mydynamicpage";

// This function grabs the HTML of the page
function ScrapePage()
{
    global $url;
    $ch = curl_init();                             // initialize curl handle
    curl_setopt($ch, CURLOPT_URL, $url);           // set URL to fetch
    curl_setopt($ch, CURLOPT_FAILONERROR, 1);      // treat HTTP errors (>= 400) as failures
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);   // allow redirects
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);   // return into a variable
    curl_setopt($ch, CURLOPT_TIMEOUT, 3);          // times out after 3s
    $result = curl_exec($ch);                      // run the whole process; false on failure
    curl_close($ch);
    return $result;
}

// This function stores the HTML in memcache and refreshes it once it is older than 3 mins
function MemCacheFunction($function_name)
{
    $memcache = new Memcache;
    $memcache->connect('localhost', 11211) or die("Could not connect");
    $memcache_key = md5('sometextkey' . $function_name);

    $memcache_result = $memcache->get($memcache_key);
    if ($memcache_result !== false) {
        //echo "It Worked!";
        return $memcache_result;
    }

    //echo "Couldn't Find Key: ".$memcache_key;
    $ret = $function_name();
    if ($ret !== false) {
        // only cache successful fetches, for 180 seconds
        $memcache->set($memcache_key, $ret, 0, 180);
    }
    return $ret;
}

// A simple condition determines whether to load page from Memcache or process the dynamic page
if (isset($_COOKIE["loggedin"]) && $_COOKIE["loggedin"] == "yes") {
    include("index_dynamic.php");
} else {
    echo MemCacheFunction("ScrapePage");
}
?>
```
Typical usage would be for heavy-traffic pages such as the homepage of a website. You simply rename your index.php page to something like index_dynamic.php and then use the code above as your new index page. In the code you have to specify the dynamic page you want to cache and, at the bottom, a condition that decides whether to load from Memcached or from the actual page. Make sure the URL points at the renamed dynamic page rather than at index.php itself, otherwise the page would end up requesting itself. The condition is the important part. For example, you would only want to show the cached copy to general traffic such as spiders and visitors who are not logged in.
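If the condition grows beyond a single cookie check, it may be worth pulling it into its own function. The following is only a sketch: the name `canServeFromCache` is hypothetical, the `loggedin` cookie is the one from the example above, and the GET-only rule is my own assumption (a form submission should probably never be answered from the cache), so adjust it to whatever your site actually needs.

```php
<?php
// Hypothetical helper: decide whether a request may be answered from the
// cached copy of the page. Assumes a "loggedin" cookie set to "yes" marks
// logged-in visitors, as in the example above.
function canServeFromCache(array $cookies, $method)
{
    // Assumption: only plain page views (GET) are safe to serve from cache.
    if (strtoupper($method) !== 'GET') {
        return false;
    }
    // Logged-in visitors always get the dynamic page.
    if (isset($cookies['loggedin']) && $cookies['loggedin'] === 'yes') {
        return false;
    }
    // Everyone else (spiders, anonymous visitors) gets the cached copy.
    return true;
}
```

The bottom of index.php would then read `if (canServeFromCache($_COOKIE, $_SERVER['REQUEST_METHOD']))`, echoing the cached page in that branch and including index_dynamic.php otherwise.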
If you need a good breakdown on installing and configuring Memcached, check this out.
