Site Search and CDNs
What is a CDN?
A Content Delivery Network (CDN) facilitates speedy delivery of static content over the internet. The general idea of a CDN is to have a number of Point of Presence servers (PoPs), often called edge nodes or edge servers, distributed across different regions, responsible for delivering static and/or streaming content to users from the nearest possible PoP.
PoPs should be synchronized with related origin servers in order to deliver up-to-date content. Thanks to anycast technology (communication between a single sender and the nearest of several receivers in a group), when a request to fetch a specific resource is sent to the CDN, the best PoP is automatically selected and the requested resource is delivered. This way the requested content is delivered quickly, avoiding any unnecessary network delays. Instead of one or two servers handling all the incoming requests, the responsibility of delivering content is distributed among the PoPs.
(Image taken from labs.ripe.net; courtesy: Google Images)
Why does Klevu use CDNs?
When a client installs Klevu on their website, Klevu adds a small JavaScript (JS) snippet to the merchant’s website. This snippet is responsible for pulling the other Klevu JavaScript files, CSS files, and images required to enable Klevu’s search functionality. Anyone accessing a website with Klevu installed automatically requests these static resources. By using a CDN, Klevu’s static content, such as JavaScript, CSS, and images, is delivered quickly when required. According to statistics gathered over a recent 30-day period, over 97% of requests for static content were served by a CDN.
Klevu also uses a CDN to serve its search results. When a shopper searches for a term on a website, certain computations are performed on Klevu servers to collect results. If the same query is fired by other shoppers, it does not make sense for the servers to perform the same computations again (unless the underlying catalog has changed). It is better that the results are served directly from cache. In such cases, the CDN acts as a geographically distributed cache for the search results. Our data shows that over 40% of search queries are being served from the CDN.
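To make the caching idea concrete, here is a minimal sketch of a result cache that only recomputes a query when it has not been seen before or when the catalog has changed. It is an illustration, not Klevu's implementation: the `SearchResultCache` class, the `catalog_version` counter, and the query normalization are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple


class SearchResultCache:
    """Caches search results per (normalized query, catalog version) pair."""

    def __init__(self, compute: Callable[[str], List[str]]) -> None:
        self._compute = compute      # hypothetical expensive search on the origin
        self._store: Dict[Tuple[str, int], List[str]] = {}
        self.catalog_version = 0     # bumped whenever the catalog changes
        self.hits = 0
        self.misses = 0

    def search(self, query: str) -> List[str]:
        normalized = query.strip().lower()
        key = (normalized, self.catalog_version)
        if key in self._store:
            self.hits += 1           # same query, same catalog: serve from cache
            return self._store[key]
        self.misses += 1
        results = self._compute(normalized)
        self._store[key] = results
        return results

    def catalog_changed(self) -> None:
        """Invalidate every cached result by moving to a new catalog version."""
        self.catalog_version += 1
        self._store.clear()
```

A real CDN would key the cache on the full request URL and spread entries across PoPs, but the principle is the same: repeated queries skip the origin until something underneath them changes.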
Synchronization of CDN resources
It is very important that the results delivered to a shopper from a CDN are synchronized with the resources on the origin servers. When a merchant installs Klevu search on their website, they are given access to the Klevu Merchant Centre (KMC). The KMC provides merchants with the ability to customize elements of their search, such as the look and feel of the search results GUI. Merchants can enable or disable certain parts of the GUI and make changes to the CSS associated with the search results GUI.
Once changes are made and saved, these updates need to be reflected on the merchant’s website ASAP. If the resources are not continually synchronized with the PoPs, an outdated version of the content will be served.
There are two models for keeping data synchronized: the push model and the pull model.
Push model – It is the responsibility of the origin servers to push the updated content to the PoPs associated with the CDN (usually with rsync over ssh).
Pull model – PoPs are responsible for fetching the updated content from the origin servers. The frequency with which PoPs contact the origin servers is configured from the CDN control panel. If resources need refreshing before the scheduled time, it is usually possible to purge specific URLs or to ask the CDN to clear out zones containing a set of resources.
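The pull model, including an early purge, can be sketched roughly as follows. This is a simplified, single-PoP illustration: the `PullPoP` class, its `purge` method, and the injectable `clock` are hypothetical names chosen for the example and do not correspond to any particular CDN's API.

```python
import time
from typing import Callable, Dict, Tuple


class PullPoP:
    """A single edge node that pulls from the origin when its copy expires."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}  # url -> (body, fetched_at)

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._cache.get(url)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]          # still fresh: no origin contact needed
        body = self._fetch(url)      # missing or expired: pull from the origin
        self._cache[url] = (body, now)
        return body

    def purge(self, url: str) -> None:
        """Drop one URL so the next request pulls a fresh copy."""
        self._cache.pop(url, None)
```

With a TTL of 600 seconds, a CSS change saved at the origin would normally take up to ten minutes to appear; calling `purge` for that URL makes the very next request fetch the new version.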
CDN failover
Since a CDN is backed by many PoPs, it is very unlikely to suffer a complete failure. Usually, if one PoP fails, another takes over and continues to deliver the content. Most CDN providers offer an SLA of 99.9% and some even guarantee 100% (for a price!).
As with anything in life, nothing is infallible. There are scenarios when CDNs do fail!
- The origin servers go down
When a resource is uploaded to a CDN, the expiry of that resource is specified. For example, if the TTL for JavaScript files is 10 minutes, all the PoPs contact the origin server every 10 minutes to fetch any modified copies of those files.
Unless PoPs are configured to serve the “stale” content, in the event of the origin server going down, the CDN would fail to deliver the requested resource beyond its set TTL. Such a scenario can be handled in two ways:
- By ensuring failovers for the origin servers
- By choosing a CDN provider that continues to serve stale content until the origin server becomes reachable again.
- Network outages
If the group of PoPs responsible for serving requests from a specific region is not reachable, requests to the CDN from that region will fail. Unless the CDN provider has failovers in place across all regions, there is little you can do other than establish your own failover scenario.
In our experience, setting up failover across two CDNs is the safest bet to achieve 100% uptime. It is very unlikely that two CDNs will go down at the same time. Additionally, since most CDN providers charge you for the bandwidth used, it is not an expensive affair to have two CDNs configured.
At Klevu, we have contracts with two industry-leading CDN providers that offer SLAs of 99.9998% and 99.9% respectively. With failover set up across the two CDNs, the combined availability is, for all practical purposes, 100%.
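As a back-of-the-envelope check on that figure, the snippet below combines the two SLAs. It rests on two simplifying assumptions that the SLAs themselves do not guarantee: the two CDNs fail independently of each other, and failover between them is instantaneous. The helper name `combined_availability` is made up for the example.

```python
from decimal import Decimal


def combined_availability(*sla_percentages: str) -> Decimal:
    """Availability (in percent) of a setup that is down only when
    every provider is down at once, assuming independent failures."""
    all_down = Decimal(1)
    for pct in sla_percentages:
        # Probability that this one provider is unavailable.
        all_down *= Decimal(1) - Decimal(pct) / Decimal(100)
    return (Decimal(1) - all_down) * Decimal(100)


availability = combined_availability("99.9998", "99.9")
```

Under those assumptions the chance of both CDNs being down together is 0.000002 × 0.001 = 0.000000002, which gives a combined availability of about 99.9999998%. In practice the DNS TTL adds a short switchover window, so the real number is slightly lower.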
Load balancing for origin servers
We maintain two dedicated servers, hosted in different countries, that are connected to a high-bandwidth load balancer. The load balancer serves as the origin server to both CDNs. Such a setup ensures that if one of the backend servers is down for any reason, the other one will continue to serve data to the CDN providers. Additionally, both our CDN providers will serve stale content if for some reason they cannot contact the origin servers.
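The "serve stale" behaviour can be illustrated with a small sketch. It shows the general technique only; the function `get_with_stale_fallback`, the `OriginDown` exception, and the `serve_stale` flag are hypothetical names, and real CDN providers expose this as a configuration option rather than code.

```python
from typing import Callable, Dict, Tuple


class OriginDown(Exception):
    """Raised by the (hypothetical) fetcher when the origin is unreachable."""


def get_with_stale_fallback(
    url: str,
    cache: Dict[str, Tuple[str, float]],   # url -> (body, fetched_at)
    fetch: Callable[[str], str],
    now: float,
    ttl: float,
    serve_stale: bool = True,
) -> str:
    entry = cache.get(url)
    if entry is not None and now - entry[1] < ttl:
        return entry[0]                    # fresh copy: serve it directly
    try:
        body = fetch(url)                  # expired or missing: ask the origin
    except OriginDown:
        if serve_stale and entry is not None:
            return entry[0]                # expired, but better than an error
        raise                              # nothing cached, or stale serving is off
    cache[url] = (body, now)
    return body
```

With `serve_stale` switched off, an origin outage surfaces as an error as soon as the TTL lapses, which is exactly the failure described in the "origin servers go down" scenario.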
Automatic failover between two CDNs
Our DNS lookups are managed via Amazon Route 53, a highly scalable and available web service, so we have set up a health check that accesses a resource via the first CDN provider. This check raises an alarm if the requested resource is not accessible (i.e. something is wrong with the first CDN provider).
To manage access to both CDNs through a single domain, we have two record sets, each pointing to a different CDN domain. One of them is marked as the primary record set and directs all traffic to the first CDN. The second record set is marked as the secondary and is used only when the health check has raised an alarm. The time-to-live (TTL) of both record sets is set to the minimum to ensure a quick swap between the two CDNs in the event of an emergency.
The health check is never deleted, which means it continues to look for the resource through the first CDN (even after it has raised an alarm). Should the resource (i.e. the first CDN) become available again, Amazon Route 53 automatically starts serving requests through the primary record set again.
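The primary/secondary logic described above boils down to a very small decision, sketched here as a toy model. It does not use the Route 53 API: the `FailoverRecordSet` class and the `cdn-a.example.net` and `cdn-b.example.net` targets are placeholders, and a real health check would issue an HTTP request and apply a failure threshold rather than return a simple flag.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FailoverRecordSet:
    """Toy model of a primary/secondary DNS failover pair."""

    primary_target: str                # domain of the first CDN
    secondary_target: str              # domain of the second CDN
    health_check: Callable[[], bool]   # True while the primary is reachable

    def resolve(self) -> str:
        # The check runs on every resolution, so traffic returns to the
        # primary automatically once it is healthy again.
        if self.health_check():
            return self.primary_target
        return self.secondary_target
```

Because the check is evaluated continuously instead of being removed after the first alarm, no manual step is needed to switch back to the first CDN.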
Conclusion
Serving accurate content quickly is essential for any good customer experience. For online retailers this applies to images, product pages, content and search results. It is for this reason that Klevu goes to great lengths to ensure that we have a fast and highly secure content delivery system. If a merchant’s search results take too long, the session times out, or outdated content is delivered, the shopper is likely to leave and go elsewhere. Klevu clients can rest assured that this will not happen with their site search.
¹See https://klevu.com/blog/whats-behind-site-search-speed/ for more information on what’s behind site search speed.