深瞳

Night like a deep eye, an eye as deep as night


Chapter 9. Caching

The distributed nature of the Web provides many opportunities for performance improvement through caching. In general, caching is the temporary storage of state for faster retrieval. Caching for Web applications can occur on the client (browser caching), on a server between the client and the Web server (proxy caching), and on the Web server itself (page caching or data caching). Both browser caching and proxy caching reduce Web server traffic by serving content either directly from the client's machine or from an intermediate proxy server, and are thus not directly managed by ASP.NET (although your ASP.NET pages can specify browser and proxy caching options by adding the appropriate metatags, Cache-Control headers, and Expires headers). Page caching and data caching, however, are directly applicable to ASP.NET and should be used, at least to some extent, in any Web application built with ASP.NET.

9.1 Caching Opportunities in ASP.NET

ASP.NET provides support for page, partial page, and data caching. Caching a page that is dynamically generated, called page output caching, improves performance by generating the page dynamically only the first time it is accessed. Any subsequent access to the same page will be returned from the cache, saving the time it would have taken to dynamically generate the page. The expiration of a cached page must be explicitly set, after which time the page will be regenerated and recached the next time it is accessed. ASP.NET also supports the ability to cache portions of a page if those portions are encapsulated into a user control.

The other opportunity for performance improvement is to reduce the number of round-trips made to a back-end data server (or even to a local database). Instead of always requesting live data from a data source, you can cache that data in memory and access it locally. Data caching can cause cache coherency problems, but when used correctly, it can dramatically improve application responsiveness. ASP.NET provides a full-featured data cache engine, complete with support for scavenging, expiration, and file and key dependencies. Figure 9-1 shows the two locations where caching can be used to improve performance in ASP.NET applications.

Figure 9-1. Caching Opportunities in ASP.NET

9.2 Output Caching

For pages whose content is relatively static, it is inefficient to regenerate the page for every client request. Instead, pages can be generated once and then cached for subsequent fetches. The OutputCache directive can be added to any ASP.NET page, specifying the duration (in seconds) that the page should be cached. The code shown in Listing 9-1 is an example of using the OutputCache directive to specify that this particular page should be cached for one hour after its first access. In this example, the page prints the date on which it was generated, so if you try accessing this page, you will notice that after the first hit, all subsequent accesses will have the same timestamp until the duration is reached.

Listing 9-1 OutputCache Directive Example
<%@ Page Language="C#" %>
<%@ OutputCache Duration='3600' VaryByParam='none' %>
<html>

  <script runat="server">
    protected void Page_Load(Object sender, EventArgs e) {
           _msg.Text = DateTime.Now.ToString();
    }
  </script>

  <body>
    <h3>Output Cache example</h3>
    <p>Last generated on:
       <asp:label id="_msg" runat="server"/></p>
  </body>

</html>

When using the OutputCache directive, you must specify at least the Duration and VaryByParam attributes. Leaving the VaryByParam attribute set to 'none', as shown in Listing 9-1, means that one copy of the page will be cached for each request type (GET, HEAD, or POST). Subsequent requests of the same type will be served a cached response for that type of request. Table 9-1 shows the complete set of attributes available with the OutputCache directive.

Table 9-1. OutputCache Directive Attributes

| OutputCache Attribute | Values | Description |
|---|---|---|
| Duration | Number | Time, in seconds, that the page or user control is cached |
| Location | 'Any', 'Client', 'Downstream', 'Server', 'None' | Controls the header and metatags sent to clients indicating where this page can be cached. Choosing 'Any' means that the page can be cached on the browser client, a downstream server, or the server. 'Client' means that the page will be cached on the client browser only. 'Downstream' means that the page will be cached on a downstream server and the client. 'Server' means that the page will be cached on the server only. 'None' disables output caching for this page. |
| VaryByCustom | 'Browser', custom string | Vary the output cache either by browser name and version or by a custom string, which must be handled in an overridden version of GetVaryByCustomString(). |
| VaryByHeader | '*', header names | A semicolon-separated list of strings representing headers submitted by a client. |
| VaryByParam | 'none', '*', parameter name | A semicolon-separated list of strings representing query string values in a GET request or variables in a POST request. This is a required attribute. |
| VaryByControl | Control name | A semicolon-separated list of strings representing properties of a user control used to vary the output cache (applicable to user controls only). |

The attributes that you specify in an OutputCache directive are used to populate an instance of the System.Web.HttpCachePolicy class by calling the System.Web.UI.Page.InitOutputCache() method. This instance is accessible programmatically through the Cache property of the HttpResponse object, which is in turn available through the Response property of the Page (or Context) class, as shown in Listing 9-2.

Listing 9-2 HttpCachePolicy Class
public sealed class HttpCachePolicy
{
  public HttpCacheVaryByHeaders VaryByHeaders {get;}
  public HttpCacheVaryByParams VaryByParams {get;}
  public void AppendCacheExtension(string extension);
  public void SetCacheability(
                    HttpCacheability cacheability);
  public void SetExpires(DateTime date);
  public void SetLastModified(DateTime date);
  public void SetMaxAge(TimeSpan delta);
  public void SetNoServerCaching();
  public void SetSlidingExpiration(bool slide);
  //...
}

public sealed class HttpResponse
{
  public HttpCachePolicy Cache {get;}
  //...
}

public class Page : ...
{
  public HttpResponse Response {get;}
  //...
}

The OutputCache directive gives you access to a subset of the functionality available in the HttpCachePolicy class. One useful feature that is only accessible programmatically is the ability to set a sliding expiration on a page. That is, whenever a page is hit, the timeout is reset. This is a useful way to ensure that only items that are being used are kept in your cache. Pages that are cached once and then never accessed again are a waste of resources. The code in Listing 9-3 shows an example of a page whose expiration is set programmatically and uses the sliding expiration scheme.

Listing 9-3 Programmatically Setting Page Caching
<%@ Page Language="C#" %>
<html>
  <script runat="server">
    void Page_Load(Object sender, EventArgs e) {
      Response.Cache.SetExpires(DateTime.Now.AddSeconds(360));
      Response.Cache.SetCacheability(
                   HttpCacheability.Public);
      Response.Cache.SetSlidingExpiration(true);
      _msg.Text = DateTime.Now.ToString();
    }
  </script>

  <body>
    <h3>Output Cache example</h3>
    <p>Last generated on:
              <asp:label id="_msg" runat="server"/></p>
  </body>
</html>

9.2.1 Output Caching Location

So far, we have discussed the advantage of output caching on the server, where it saves server processing time by loading the page from a cached rendering stored in the ASP.NET worker process instead of dynamically generating it. In addition to server caching, there are two other opportunities for page caching. First, many browsers can cache pages on the client machine. This is the most efficient method of all because it avoids any network traffic and renders the page directly from the client machine's cache. Web pages indicate that they should be cached in client browsers through the Expires header of their HTTP response, indicating the date and time after which the page should be retrieved from the server again. Second, the HTTP 1.1 protocol supports the caching of responses on transparent proxy servers, sitting between the client and the server. Pages can indicate whether they should be cached on a proxy by using the Cache-Control header.

If your page is already output cacheable, it usually makes sense to make that page client and proxy cacheable too. It turns out that the OutputCache directive on a page enables all three types of caching (server, client, and proxy) by default. This means that when you mark a page with an OutputCache directive, you are effectively saying that this page will not change for a specific period of time, and if it is possible to cache it anywhere in the pipeline between your ASP.NET application and the client browser, please do so. This is useful because with one statement, you can advertise the cache friendliness of your page, specifying the expiration time only once, and let ASP.NET render your page appropriately to whatever client asks for it.

On the other hand, sometimes you might need more precise control over exactly where your page is cached. The Location attribute of the OutputCache directive lets you specify where you want your page to be cached. Table 9-2 shows the values of the Location attribute and how they affect the Cache-Control header, the Expires header, and the server caching of your page.

Table 9-2. Effect of the Location Attribute in Output Caching

| Value of Location | Cache-Control Header | Expires Header | Page Cached on Server |
|---|---|---|---|
| 'Any' | public | Yes | Yes |
| 'Client' | private | Yes | No |
| 'Downstream' | public | Yes | No |
| 'Server' | no-cache | No | Yes |
| 'None' | no-cache | No | No |

For example, if you specified a value of 'Client' for the Location attribute of an OutputCache directive on a page, the page would not be saved in the server cache, but the response would include a Cache-Control header value of private and an Expires header with a timestamp set to the time indicated by the Duration attribute, as shown in Listing 9-4.

Listing 9-4 Designating Private Caching
<%@ OutputCache Duration='120' Location='Client'
                VaryByParam='none' %>
...
----- generates the following response -----
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.1
Date: Tue, 01 Jan 2002 12:00:00 GMT
Cache-Control: private
Expires: Tue, 01 Jan 2002 12:02:00 GMT
...
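
The same headers can also be requested in code through the HttpCachePolicy members shown in Listing 9-2. The following is a minimal sketch, not the author's listing: the class name ClientCachedPage is hypothetical, and it assumes the page is wired to this class through the Inherits attribute of its Page directive. It is meant to correspond roughly to Location='Client' with a Duration of 120 seconds.

```csharp
using System;
using System.Web;
using System.Web.UI;

// Hypothetical code-behind class; the name is illustrative only.
public class ClientCachedPage : Page
{
    protected override void OnLoad(EventArgs e)
    {
        base.OnLoad(e);

        // Ask for "Cache-Control: private": the browser may keep the
        // response, shared proxy servers may not.
        Response.Cache.SetCacheability(HttpCacheability.Private);

        // Ask for an Expires header two minutes from now,
        // mirroring Duration='120' in Listing 9-4.
        Response.Cache.SetExpires(DateTime.Now.AddSeconds(120));
    }
}
```

Inspecting the response headers with a browser's developer tools or a command-line HTTP client is an easy way to confirm what was actually sent.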

9.2.2 Caching Multiple Versions of a Page

Users can request pages in a Web application in several ways. They can issue a plain GET request, a plain HEAD request, a GET request with an accompanying query string with name/value pairs appended, or a POST request with an accompanying body containing name/value pairs. Caching pages that are retrieved using only a GET request with no query string is straightforward, because the page never changes its contents based on the request (except possibly based on client headers, which we will come back to). Caching pages that are accessed with changing query strings or POST variable values becomes more complex, because a distinct version of the page must be cached for each unique query string or variable combination that is submitted.

Before you decide to enable output caching on an ASP.NET page, you must decide how many versions of that page should be cached. The options are to cache only one copy of the page for each request type (GET, HEAD, or POST); to cache all GET, HEAD, and POST requests (implying separate cached versions of the page for each request); or to cache multiple versions of a page only if a particular variable in a GET or POST changes. This option is controlled through the VaryByParam attribute of the OutputCache directive, whose values are shown in Table 9-3.

If you set the VaryByParam attribute to 'none', only one version of the page is stored in the output cache for each request type. If a user issues a GET request to a page with an accompanying query string, the output cache ignores the query string and returns the single cached instance of the page for GET requests. If, on the other hand, you set the VaryByParam attribute to '*', a new version of that page is cached for each unique query string and each unique collection of POST variables across all client requests. This setting is potentially very inefficient and must be used carefully. For example, suppose a page that accepted a person's name in a query string were marked with the OutputCache directive and specified a VaryByParam value of '*'. For each client request with a different name, a new copy of the page would be stored in the output cache. Unless many people with the same name hit that page, there would likely be very few cache hits, and the cached pages would just be wasting server memory. This scenario is depicted in Figure 9-2.

Figure 9-2. Caching Multiple Copies of a Page

Table 9-3. VaryByParam Values

| VaryByParam Value | Description |
|---|---|
| 'none' | One version of page cached (only raw GET or HEAD) |
| '*' | N versions of page cached based on query string and/or POST body |
| v1 | N versions of page cached based on value of v1 variable in query string or POST body |
| v1;v2 | N versions of page cached based on value of v1 and v2 variables in query string or POST body |

The VaryByParam attribute can also be set to the name, or semicolon-separated list of names, of query string or POST variables. The decision of whether to create a unique entry in the output cache for a page is then based on whether the particular variable (or variables) listed change from one request to another. It is unlikely this capability would be used very often, because query string and POST variables typically either determine how a page is rendered or, at the very least, are stored in some back-end data source when the page is posted.

In addition to caching different versions of a page based on the parameters passed by a client request, you can cache different versions of a page for a variety of other reasons. The VaryByHeader attribute of the OutputCache directive caches a different version of a page whenever a header string (or set of header strings, which you can specify) differs from one client to the next. This is important if you render your page differently based on the headers supplied by the client (which happens implicitly with many ASP.NET controls). For example, if you conditionally render portions of your page based on the Accept-Language header passed in by clients, you need to make sure that a separate cache entry is made for each language that clients request. The page in Listing 9-5 prints a message in the client's preferred language (as long as it is French, German, or English). If we applied the OutputCache directive to this page without a VaryByHeader constraint, the first client to request it would see his preferred language, but subsequent clients would see the first client's preferred language until the duration expired. Using the VaryByHeader constraint with Accept-Language as a value causes a distinct rendering of this page to be stored in the output cache for each client request with a unique language preference.

Listing 9-5 Using VaryByHeader
<!-- File: LanguagePage.aspx -->

<%@ Page language='C#' %>
<%@ OutputCache Location='any'
                VaryByParam='none'
                Duration='120'
                VaryByHeader='Accept-Language' %>
<html>
  <head>
  <script runat="server">
    protected void Page_Load(Object src, EventArgs e)
    {
      if (!IsPostBack)
       {
          switch (Request.UserLanguages[0])
          {
            case "fr":
              _msg.Text = "Bonjour!  Comment allez-vous?";
              break;
            case "de":
              _msg.Text = "Guten Tag!  Wie geht's?";
              break;
            default:
              _msg.Text = "Hello!  How are you?";
              break;
          }
       }
       Response.Write(DateTime.Now.ToString());
    }

  </script>
  </head>
  <body>
  <form runat=server>
    <asp:Label id='_msg' runat=server />
  </form>
  </body>
</html>

Finally, you can cache separate page renderings based on the browser type and version, or any other criteria you need, through the VaryByCustom attribute. If you know that a page may render differently for different browsers, it is important that you store a separate cache instance for each browser type that accesses the page. Setting the VaryByCustom attribute of the OutputCache directive to Browser causes a unique instance of the page to be cached for each browser type and major version number that accesses your page. Note that this is different from using the VaryByHeader option with a value of User-Agent because that would store a unique instance in the cache for each user agent string, which would generate many more entries. It is important to realize that many server-side controls render themselves differently based on the browser type and version, including the Calendar, TreeView, Toolbar, TabStrip, and MultiPage controls, to name a few. If you use any of these controls in a page on which you have enabled output caching, you should be sure to include a VaryByCustom attribute set to 'Browser', as shown in Listing 9-6.

Listing 9-6 Using VaryByCustom Set to 'Browser'
<%@ Page Language='C#' %>
<%@ OutputCache Location='Any'
                VaryByParam='none'
                Duration='120'
                VaryByCustom='Browser' %>
<html>
<body>
  <form runat=server>
    <asp:Calendar id='_cal' runat='server' />
  </form>
</body>
</html>
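
The three vary attributes discussed so far (VaryByParam, VaryByHeader, and VaryByCustom) have programmatic counterparts on the HttpCachePolicy class. The sketch below shows one way they might be set from code; the class name VaryingPage and the parameter name v1 are placeholders, and the page is assumed to inherit from this class.

```csharp
using System;
using System.Web;
using System.Web.UI;

// Hypothetical code-behind class; names are illustrative only.
public class VaryingPage : Page
{
    protected override void OnLoad(EventArgs e)
    {
        base.OnLoad(e);

        HttpCachePolicy policy = Response.Cache;

        // Make the response cacheable for 120 seconds.
        policy.SetExpires(DateTime.Now.AddSeconds(120));
        policy.SetCacheability(HttpCacheability.Public);
        policy.SetValidUntilExpires(true);

        // Counterpart of VaryByParam='v1'
        policy.VaryByParams["v1"] = true;

        // Counterpart of VaryByHeader='Accept-Language'
        policy.VaryByHeaders["Accept-Language"] = true;

        // Counterpart of VaryByCustom='Browser'
        policy.SetVaryByCustom("Browser");
    }
}
```

Setting these in code rather than in the directive is mainly useful when the choice of what to vary by is only known at run time.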

If you render your page conditionally based on any other factor, you can use the VaryByCustom attribute in conjunction with an overridden implementation of HttpApplication.GetVaryByCustomString in your application class. The purpose of this function is to take the string value of the VaryByCustom attribute as a parameter and return a string that is unique with respect to some aspect of the page, request, or application. In most cases, the implementation of GetVaryByCustomString checks some value in the current HttpBrowserCapabilities class and returns a unique string based on that value.

For example, suppose that you have built a page that renders differently based on the client browser's level of table support. You might provide an overridden version of GetVaryByCustomString, as shown in Listing 9-7.

Listing 9-7 GetVaryByCustomString Implementation
<%-- File: global.asax --%>
<%@Application language='C#' %>

<script runat=server>
public override string
  GetVaryByCustomString(HttpContext ctx, string arg)
{
  switch (arg)
  {
     case "Tables":
       return "Tables=" + ctx.Request.Browser.Tables;
     default:
         return "";
  }
}
</script>

This implementation would return a string value of "Tables=True" for client browsers that supported tables and "Tables=False" for client browsers that did not (the Boolean Tables property is converted with ToString(), which capitalizes the value). This string would then be appended onto the other OutputCache distinguishing strings and used to index the output cache to store and retrieve renderings of this page. An example of a page that used this VaryByCustom attribute is shown in Listing 9-8.

Listing 9-8 Using VaryByCustom in a Page
<%@ Page language='C#' %>
<%@ OutputCache Location='any'
                VaryByParam='none'
                Duration='120'
                VaryByCustom='Tables' %>
<html>
  <head>
  <script runat="server">
    protected void Page_Load(Object src, EventArgs e)
    {
       if (Request.Browser.Tables)
       {
          // render with tables
       }
       else
       {
          // render without tables
       }
    }
  </script>
  </head>
  ...
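
One thing to watch for when overriding GetVaryByCustomString is that an override which returns an empty string for unrecognized values also replaces the built-in handling of 'Browser'. A possible variation, sketched here as a compiled application class rather than an inline script, forwards everything it does not recognize to the base class. The class name Global is an assumption, and global.asax would need an Inherits attribute pointing at it.

```csharp
using System.Web;

// Hypothetical application class; global.asax is assumed to
// reference it with <%@ Application Inherits="Global" %>.
public class Global : HttpApplication
{
    public override string GetVaryByCustomString(HttpContext ctx, string arg)
    {
        if (arg == "Tables")
        {
            // One cache entry per level of table support.
            return "Tables=" + ctx.Request.Browser.Tables;
        }

        // Let the base class handle everything else,
        // including the predefined "Browser" value.
        return base.GetVaryByCustomString(ctx, arg);
    }
}
```

With this arrangement, pages that specify VaryByCustom='Tables' and pages that specify VaryByCustom='Browser' can coexist in the same application.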

In general, when you add output caching to a page, it is important to ask yourself if this page will render itself differently in different conditions (different client properties, different times of day, and so on) and make sure you compensate for that by indexing the output cache uniquely for all those different rendering possibilities.

9.2.3 Page Fragment Caching

Even more common than entire pages that change infrequently are portions of pages that change infrequently. For example, there are often navigation bars, menus, or headers that are common to many pages in an application and that change infrequently (especially not between different client requests), which makes them ideal for caching. Fortunately, ASP.NET provides a mechanism for caching portions of pages, called page fragment caching. To cache a portion of a page, you must first encapsulate the portion of the page you want to cache into a user control. In the user control source file, add an OutputCache directive specifying the Duration and VaryByParam attributes. When that user control is loaded into a page at runtime, it is cached, and all subsequent pages that reference that same user control will retrieve it from the cache, thus improving throughput. The user control shown in Listing 9-9 specifies output caching for 60 seconds.

Listing 9-9 Specifying Page Fragment Caching in a User Control
<!-- File: MyUserControl.ascx -->

<%@ OutputCache Duration='60'
                VaryByParam='none' %>
<%@ Control Language='C#' %>

<script runat=server>
  protected void Page_Load(Object src, EventArgs e)
  {
     _date.Text = "User control generated at " +
                   DateTime.Now.ToString();
  }
</script>
<asp:Label id='_date' runat='server' />

In the sample client page shown in Listing 9-10, the page itself is not output-cached, but the user control embedded in it is. In this example, because both the page and the control it embeds print the time at which they were generated, you will see a discrepancy between the printed times as the page is refreshed and the control is drawn from the cache.

Listing 9-10 Cached User Control Client
<!-- File: UserControlClient.aspx -->

<%@ Page Language='C#' %>
<%@ Register TagPrefix='DM' TagName='UserCtrl'
             Src='MyUserControl.ascx' %>
<html>
<head>
<script runat='server'>
  protected void Page_Load(Object src, EventArgs e)
  {
     _pageDate.Text = "Page generated at " +
                       DateTime.Now.ToString();
  }
</script>
</head>
<body>
<form runat='server'>
  <DM:UserCtrl runat='server'/>
  <br/>
  <asp:Label id='_pageDate' runat='server' />
</form>
</body>
</html>
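
When a user control has a code-behind class, the same 60-second fragment caching can be requested with the PartialCaching attribute instead of an OutputCache directive in the .ascx file. This is an alternative sketch rather than the approach used in Listing 9-9; the class name CachedDateControl is hypothetical, and the .ascx file is assumed to inherit from it and to declare a Label with the id _date.

```csharp
using System;
using System.Web.UI;
using System.Web.UI.WebControls;

// Hypothetical code-behind class for a user control.
// The attribute asks ASP.NET to cache the rendered control for 60 seconds.
[PartialCaching(60)]
public class CachedDateControl : UserControl
{
    // Assumed to match <asp:Label id='_date' runat='server' /> in the .ascx file.
    protected Label _date;

    protected override void OnLoad(EventArgs e)
    {
        base.OnLoad(e);
        // Runs only when the control is actually regenerated,
        // not when its output is served from the cache.
        _date.Text = "User control generated at " + DateTime.Now.ToString();
    }
}
```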

User controls can also change their rendering based on the type of request the control is responding to, or perhaps based on properties exposed by a control. It is important to determine the circumstances under which the contents of a user control will change before you apply the OutputCache directive to it. There are three ways of indicating that a distinct cache entry is required for a cached user control.

  1. You can include a VaryByParam attribute to create different cache entries based on the parameters of the current POST.

  2. You can include a VaryByControl attribute to cache different entries based on programmatic values of controls embedded in the user control (such as a combo box selecting some appearance aspect of the control).

  3. User controls will automatically be cached in different entries if the user control is instantiated in a page with properties specified in the tag.

The first of these three options is probably the least likely to be useful, since user controls are typically used from several different pages, whose POST variables will be different. It can be complicated to correctly identify the variables to vary by because of the way the parameters are parsed and sent to user controls (they are scoped by the control name).

The second of the three options was added to simplify the process of identifying which parameters should determine unique cache entries for your user control. Instead of referring to POST or GET variables directly, your user control can specify which of its child controls should affect its cache entry. For example, if you built a user control that changed its rendering based on the value of a drop-down list, you would want to be sure that there was a unique entry for every value of that drop-down list. By specifying the drop-down list in the VaryByControl attribute, you ensure that a unique cache entry will be stored for each value selected in the list. The user control shown in Listing 9-11 demonstrates this.

Listing 9-11 Specifying VaryByControl in a User Control
<!-- File: MyUserControl.ascx -->
<%@ OutputCache Duration='120'
                VaryByControl='_favoriteColor' %>
<%@ Control Language='C#' %>

<p>Select your favorite color</p>
<asp:DropDownList AutoPostBack='true' id='_favoriteColor'
                  runat='server'>
     <asp:ListItem>red</asp:ListItem>
     <asp:ListItem>green</asp:ListItem>
     <asp:ListItem>blue</asp:ListItem>
</asp:DropDownList>
<p>Here it is!</p>
<span
   style='width:50;background-color:
<%=_favoriteColor.SelectedItem%>'>
</span>

The third option for uniquely specifying cache entries for user controls is to expose public properties. There is nothing special you have to do to enable this except to expose public properties and set the property values in the user control creation. For example, if we had a user control that exposed a single public property called FavoriteColor, adding an output cache directive to the control would cache separate versions of the control based on the value of that property on creation. A sample user control that does this is shown in Listing 9-12, and a sample client is shown in Listing 9-13.

Listing 9-12 Specifying Unique Cache Entries by Exposing a Public Property
<!-- File: MyUserControl.ascx -->
<%@ OutputCache Duration='120' VaryByParam='none' %>
<%@ Control Language='C#' %>

<script runat='server'>
private string _color;
public string FavoriteColor
{
     get { return _color; }
     set { _color = value; }
}
</script>

<p>Here is your favorite color:</p>
<span style='width:50;background-color:<%=_color%>'>
</span>

Listing 9-13 Client to Cached User Control with Public Property
<%@ Page Language='C#' %>
<%@ Register TagPrefix='DM' TagName='UserCtrl'
             Src='MyUserControl.ascx' %>
<html>
<body>
<form runat='server'>
  <DM:UserCtrl FavoriteColor='green' runat='server'/>
</form>
</body>
</html>

9.2.4 Output Caching Considerations and Guidelines

As we have seen, you have many options to consider when enabling output caching for a page. It is important to balance the estimated increase in throughput with the additional overhead of saving one or more renderings of a page in memory. While this trade-off is not easy to calculate precisely, here are some guidelines you should consider when deciding whether to enable output caching on a page.

  1. Enable output caching on a page that is frequently accessed and returns the exact same contents for many of those accesses.

    It is useless to cache a page if it is rarely accessed. It wastes memory and incurs more overhead on the few requests that the page gets. Keep this in mind as you begin deciding on which pages in your application to enable output caching. Good candidates for output caching are pages that are accessed frequently and render themselves identically for all or most of those accesses. This is somewhat alleviated by the fact that output-cached pages are stored in the data cache and that pages are evicted from the cache on a "least recently used" basis when memory is constrained.

  2. Cache as few versions of a page as possible.

    This guideline relates to the first one in that it advises you to cache as few versions of a page as possible. If you cache every possible version of a page (assuming it varies with a query string or POST body), you will populate the cache with a large number of page renderings, many of which will probably never be accessed again. Try to anticipate the most common use of your pages (or use site statistics to understand common use), and use the attributes of the OutputCache directive to cache only the most frequently accessed versions of a page.

  3. If a page is accessed frequently, but portions of its contents change with each access, consider separating the static or semistatic portions of the page into output-cached user controls.

    Before deciding that a frequently accessed page is uncacheable because it changes with each request, you should look carefully at the entire contents of the generated page. If any portions remain static from one request to another, especially if those portions are somewhat expensive to render (if they are generated from a database query, for example), you may want to consider using page fragment caching to cache only those portions of the page. By encapsulating portions of the page in one or more user controls, you can then enable output caching on the user controls themselves.

  4. When enabling output caching for a page, be sure not to introduce incorrect behavior and/or rendering for any particular client.

    In addition to controlling which versions of a page are output-cached for efficiency, you want to be very sure that you are not introducing any incorrect behavior when adding an OutputCache directive to a page. For example, suppose you have a page that displays a form with two fields, name and age, and you add an OutputCache directive with the VaryByParam attribute set to 'name'. For each request that comes in with a distinct value for name, you cache a new version of the page. However, if someone posts the same name to your page with a different age, ASP.NET still retrieves the rendered page from the cache, which was rendered with the first value for age that was submitted, resulting in incorrect behavior.

  5. Determine the duration of the cached page carefully to balance speed of access (throughput) with memory consumption and cache coherency correctness.

    When determining the length of the duration for an output-cached page, you have two important considerations. First, the longer the page stays cached, the longer it occupies memory. This is fine if it is being frequently accessed in the cache, but if it is not being accessed, it is simply wasting space. Second, as with any caching mechanism, you need to be careful that the cached version of your page is not out of date with the data used to generate it (often called the cache coherency problem). To avoid this, choose a duration that is short enough to ensure that the underlying data used to generate the page will not change while the page is cached. In some cases, cached pages with stale data may be acceptable, but be sure you are aware that you have made a decision to potentially serve stale pages.

  6. Consider enabling sliding expiration on a page if you end up using VaryByParam='*'.

    One of the easiest ways to enable correct output caching on pages that change with requests is to set VaryByParam to '*'. By doing this, however, you will probably cache many more versions of your page than necessary (in all likelihood, the most commonly accessed renderings of the page will be a small subset of the total set of page renderings). It is advisable, therefore, to enable sliding expiration on a page with VaryByParam set to '*'. This will keep versions of the page that are accessed frequently in the cache, but those that are not accessed frequently will be removed from the cache as soon as their expiration is reached. Keep in mind, however, that enabling sliding expiration on a page can easily lead to cache coherency problems, and thus this scenario may be best avoided altogether.
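
Guidelines 4 and 6 can be illustrated together. The sketch below, which assumes a hypothetical code-behind class named PersonPage for the name-and-age form described in guideline 4, varies the cache by both form fields so that a different age is never served a stale rendering, and turns on sliding expiration so that rarely requested combinations age out of the cache.

```csharp
using System;
using System.Web;
using System.Web.UI;

// Hypothetical code-behind class for the name/age form.
public class PersonPage : Page
{
    protected override void OnLoad(EventArgs e)
    {
        base.OnLoad(e);

        HttpCachePolicy policy = Response.Cache;
        policy.SetExpires(DateTime.Now.AddSeconds(120));
        policy.SetCacheability(HttpCacheability.Public);
        policy.SetValidUntilExpires(true);

        // Guideline 4: vary by every parameter that affects the
        // rendering, not just "name".
        policy.VaryByParams["name"] = true;
        policy.VaryByParams["age"] = true;

        // Guideline 6: each hit resets the timeout, so only the
        // combinations that keep being requested stay cached.
        policy.SetSlidingExpiration(true);
    }
}
```

As guideline 6 warns, a sliding expiration means a popular entry may never expire, so this only makes sense when the underlying data for a given name and age cannot change.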

9.3 Data Caching

Internally, the output cache is built using a sophisticated data caching engine. This data caching engine is available directly to page developers as well through the Cache property of the Page class and should be used in addition to output caching (or instead of it, in some cases) to improve response times.

Caching of data can dramatically improve the performance of an application by reducing database contention and round-trips. The data cache provided by ASP.NET gives you complete control over how data that you place in the cache is handled. At its simplest level, data caching can be used as a way to store and restore values in your application, which is trivial to do using its dictionary interface. The example shown in Listing 9-14 demonstrates the caching of a DataView that has been populated from a database query. The first time this page is accessed, the database is queried, the DataView is populated, and it is then placed in the cache. On subsequent accesses, the DataView will be retrieved from the cache, saving the time required to query the database again.

Listing 9-14 Caching a DataView in the Data Cache
<%-- File: DataViewCache.aspx --%>
<%@ Page Language="C#" %>
<%@ Import Namespace="System.Data" %>
<%@ Import Namespace="System.Data.SqlClient" %>
<html>
<script runat="server">
protected void Page_Load(Object src, EventArgs e)
{
  // Look in the data cache first
  DataView dv = (DataView)Cache["EmployeesDataView"];
  if (dv == null)  // wasn't there
  {
    SqlConnection conn = new SqlConnection(
         "server=localhost;uid=sa;pwd=;database=Test");
    SqlDataAdapter da =
       new SqlDataAdapter("select * from Employees", conn);
    DataSet ds = new DataSet();
    da.Fill(ds, "Employees");
    dv = ds.Tables["Employees"].DefaultView;
    dv.AllowEdit   = false;
    dv.AllowDelete = false;
    dv.AllowNew    = false;
      // Save employees table in cache
    Cache["EmployeesDataView"] = dv;
    conn.Close();
  }
  else
    Response.Write("<h2>Loaded from data cache!</h2>");
  lb1.DataSource = dv;
  lb1.DataTextField = "Name";
  lb1.DataValueField = "Age";
  DataBind();
}
</script>
<body>
<form runat="server">
<asp:ListBox id="lb1" runat=server />
</form>
</body>
</html>

The data cache exists at the scope of the application and in many ways is identical in functionality to the application state bag (HttpApplicationState), with two important differences. First, by default, anything placed in the data cache is not guaranteed to be there when you attempt to retrieve it again. This means that you should always be prepared for a cache miss by being able to retrieve the data from its original source if the cache returns null, as demonstrated in the previous example. The second difference is that the data cache is not intended as a place to store shared, updateable data. Because the cache lives at the application scope, the potential for concurrent access is high, and in fact, the Cache class uses a multireader, single-writer synchronization object (System.Threading.ReaderWriterLock) to ensure that no more than one thread modifies the cache at a time. This synchronization object, however, is not exposed externally and thus cannot be used by clients to perform their own locking. This is in contrast to the HttpApplicationState class, which provides a pair of methods, Lock() and UnLock(), that clients call to perform explicit locking whenever they modify the application state. It is also important to keep in mind that the cache lives at the application scope in a particular instance of the ASP.NET worker process and is not shared between processes or machines. This means that cached data is not intrinsically synchronized across machines in a Web farm.

As a result, the proper and intended use of the data cache is to store read-only data or objects for the convenience of access. Note that in the previous example, the DataView that was cached was modified to prevent updates, deletes, or insertions, effectively making it read-only. It is good practice to make cache entries read-only to ensure that cached data is not accidentally modified. The example in Listing 9-15 shows how not to use the data cache.

Listing 9-15 Improper Use of the Data Cache
<%-- File: BadCache.aspx --%>
<%@ Import Namespace="System.Collections" %>
<html>
<script language="C#" runat="server">
protected void Page_Load(Object src, EventArgs e)
{
  // Look in the data cache first
  ArrayList al = (ArrayList)Cache["MyList"];
  if (al == null)  // wasn't there
  {
     al = new ArrayList();
      // Save ArrayList in cache
    Cache["MyList"] = al;
  }
  // Manipulate the ArrayList by adding the time this
  // request was made (bad! may be accessed concurrently!)
  al.Add(DateTime.Now.ToString());
  lb1.DataSource = al;
  DataBind();
}
</script>
<body>
<form runat="server">
<asp:ListBox id="lb1" runat=server />
</form>
</body>
</html>

In this example, an instance of the ArrayList class is stored in the cache. It is modified every time the page is hit by adding the time of the current request. This is dangerous because multiple client requests may come in concurrently to this application, and the ArrayList class is not thread-safe by default.
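One way to avoid this hazard, sketched below, is to treat the cached list as immutable: copy it, change the copy, and then replace the cache entry with a read-only wrapper. The CachedListHelper class and its AppendEntry method are hypothetical names invented for this illustration, not part of ASP.NET.

```csharp
using System;
using System.Collections;
using System.Web.Caching;

// Hypothetical helper, not part of ASP.NET
public sealed class CachedListHelper
{
    private CachedListHelper() {}

    public static ArrayList AppendEntry(Cache cache, string key,
                                        object entry)
    {
        ArrayList current = (ArrayList)cache[key];

        // Build a private copy; no other thread can see it yet
        ArrayList updated = (current == null)
            ? new ArrayList()
            : new ArrayList(current);
        updated.Add(entry);

        // Publish a read-only wrapper so that later readers
        // cannot modify the shared instance
        ArrayList published = ArrayList.ReadOnly(updated);
        cache[key] = published;
        return published;
    }
}
```

A page could then bind with lb1.DataSource = CachedListHelper.AppendEntry(Cache, "MyList", DateTime.Now.ToString()). This keeps readers from ever seeing a list in mid-modification, but two requests that append at the same moment can still overwrite each other's entry, so it is no substitute for a properly synchronized store when every update must be kept.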

The data cache is also used internally to manage the HTTP pipeline. It is often instructive to view the contents of this data cache, including all system-cached objects and any you may have added to the cache. You can easily do this by calling the function shown in Listing 9-16 from within any ASP.NET page.

Listing 9-16 Displaying the Contents of the Data Cache
private void PrintDataCache()
{
  // Display all of the items stored in the ASP.NET cache
  Response.Write("<b>Data cache contains:</b><br/>");
  Response.Write("<table>");
  Response.Write("<tr><td><b>Key</b></td>");
  Response.Write("<td><b>Value</b></td></tr>");
  foreach (DictionaryEntry de in Cache)
  {
    // HTML-encode keys and values so that any markup
    // they contain is displayed as text
    Response.Write("<tr><td>");
    Response.Write(Server.HtmlEncode(de.Key.ToString()));
    Response.Write("</td><td>");
    Response.Write(Server.HtmlEncode(de.Value.ToString()));
    Response.Write("</td></tr>");
  }
  Response.Write("</table>");
}

9.3.1 Cache Entry Attributes

So far we have seen that the data cache is similar to the application state object except for object lifetime and updateability. There are several other differences as well, primarily related to determining the lifetime of an object in the cache. Each time a new item is inserted into the cache, it is added with a collection of attributes. Every cache entry is represented by an instance of the private CacheEntry class, which is created on behalf of your item when you perform a cache insertion. While you don't have direct access to this class when using the cache, you can control the attributes of each instance when you add objects to the cache. Table 9-4 shows the various properties of the CacheEntry class and their meanings.

Table 9-4. CacheEntry Properties

| Property | Type | Description |
| --- | --- | --- |
| Key | String | A unique key used to identify this entry in the cache |
| Dependency | CacheDependency | A dependency this cache entry has (on a file, a directory, or another cache entry) that, when changed, should cause this entry to be flushed |
| Expires | DateTime | A fixed date and time after which this cache entry should be flushed |
| SlidingExpiration | TimeSpan | The time between when the object was last accessed and when the object should be flushed from the cache |
| Priority | CacheItemPriority | How important this item is to keep in the cache compared with other cache entries (used when deciding how to remove cache objects during scavenging) |
| OnRemoveCallback | CacheItemRemovedCallback | A delegate that can be registered with a cache entry for invocation upon removal |

When the default indexer of the data cache is used to insert items, as was shown in the previous examples, the values of the CacheEntry class are set to default values. This means that the expiration is set to infinite, the sliding expiration is at 0, the CacheItemPriority is Normal, and the CacheItemRemovedCallback is null. Basically, your object will remain in the cache as long as no scavenging operation occurs (typically because of excessive process memory usage) and you don't explicitly remove it.

If you want more control over the attributes of the CacheEntry created for your cached object, you can use one of several overloaded versions of the Insert() method. The most verbose version of Insert() takes all the CacheEntry properties as parameters (plus the object to be cached) and passes them into the constructor for the CacheEntry class. For example, the code shown in Listing 9-17 inserts a string into the data cache that is set to expire a second before midnight on December 31, 2001.

Listing 9-17 Setting Expiration Dates in the Data Cache
// Placeholder value; obtain the object to cache however you need to
object obj = "value to cache";
DateTime dt = new DateTime(2001, 12, 31, 23, 59, 59);
Cache.Insert("MyVal", // key
             obj,     // object
             null,    // dependencies
             dt,      // absolute expiration
             Cache.NoSlidingExpiration, // sliding exp.
             CacheItemPriority.Default, // priority
             null);   // callback delegate

9.3.1.1 Cache Object Lifetime

Whenever data is added to the data cache, you must specify its lifetime (or implicitly accept the default lifetime of infinite). This is an important decision because it directly affects the correctness of data retrieval in your application, and if not done correctly, can lead to working with stale data, often referred to as cache coherency problems. How you determine the lifetime of the data that you place in the cache depends entirely on the type of data you are caching. The data may become invalid when a file changes on the system or when another cache entry becomes invalid. It may become invalid after a fixed period of time (absolute expiration). Or perhaps the data is not in danger of becoming stale, but you don't want it to occupy memory in the cache unless it is actually being referenced (achieved with sliding expiration times). Finally, you can register a callback delegate for the data cache to invoke whenever a particular item is removed from the cache if you want to take specific action when the item is removed.
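For instance, the two time-based policies map onto the expiration arguments of Cache.Insert() roughly as follows. The method name, the keys, the cached values, and the ten-minute intervals are placeholders chosen for illustration, and the fragment assumes it runs inside a page, where the Cache property is available.

```csharp
// Place inside a page's <script runat="server"> block.
// CacheWithLifetimes, "Rates", and "Lookup" are hypothetical names.
private void CacheWithLifetimes(object rates, object lookup)
{
  // Absolute expiration: flush the entry ten minutes from now,
  // no matter how often it is read in the meantime
  Cache.Insert("Rates", rates, null,
               DateTime.Now.AddMinutes(10),
               Cache.NoSlidingExpiration);

  // Sliding expiration: flush the entry only after it has gone
  // ten minutes without being accessed
  Cache.Insert("Lookup", lookup, null,
               Cache.NoAbsoluteExpiration,
               TimeSpan.FromMinutes(10));
}
```

Only one of the two policies can be applied to a given entry. Passing both an absolute expiration and a sliding expiration in the same call causes Insert() to throw an ArgumentException.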

All these options can be specified when you insert an item into the cache using the Cache.Insert() method. The code in Listing 9-18 shows an example of adding the contents of a file to the data cache on application start (in the global.asax file). This cache entry becomes invalid if the contents of the file used to populate the cache entry change, so a CacheDependency on that file is attached to the entry. We also register a callback function to receive notification of when the data is removed from the cache. Finally, this entry is set to have no absolute expiration, no sliding expiration, and the default value for priority.

Listing 9-18 Using Cache Dependencies
<%-- File: global.asax --%>
<%@ Application Language="C#" %>
<script runat=server>
public void OnRemovePi(string key, object val,
                       CacheItemRemovedReason r)
{
     // Perhaps perform some action in response to
     // cache removal here
}

// Reads pi.txt, caches its contents with a dependency on the
// file, and returns the string. Listing 9-19 calls this method
// when it finds the "pi" entry missing from the cache.
public string LoadPi()
{
  string path = Server.MapPath("pi.txt");
  string pi;
  // Dispose of the reader so the file handle is released
  using (System.IO.StreamReader sr =
            new System.IO.StreamReader(path))
  {
    pi = sr.ReadToEnd();
  }

  CacheDependency piDep = new CacheDependency(path);
  Context.Cache.Add("pi", pi, piDep,
                    Cache.NoAbsoluteExpiration,
                    Cache.NoSlidingExpiration,
                    CacheItemPriority.Default,
          new CacheItemRemovedCallback(OnRemovePi));
  return pi;
}

public void Application_OnStart()
{
  LoadPi();
}
</script>

Any page that is part of this application can then reference the "pi" key in the data cache and be guaranteed that it is always up to date with the contents of the pi.txt file. Listing 9-19 shows how it might be used, in this case to populate the contents of a text box with the value of the string in the file.

Listing 9-19 Sample Page Accessing a Cache Element
<%-- File: PiPage.aspx --%>
<%@ Page Language="C#" %>
<html>
<head>
<script runat=server>
protected void Page_Load(Object src, EventArgs e)
{
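  // Read the entry once; it could be removed from the cache
  // between two separate lookups
  string piText = (string)Cache["pi"];
  if (piText == null)
  {
     // Refresh pi in app
     piText =
       ((global_asax)Context.ApplicationInstance).LoadPi();
  }
  pi.Text = piText;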
}
</script>
</head>

<body>
<form runat="server">
<h1>The pi Page</h1>
<asp:TextBox id="pi" runat=server Rows=50 Wrap=True
             Width=450px TextMode=MultiLine
             Height=300px/>
</form>
</body>
</html>

9.3.2 Cache Object Removal

An object in the data cache can be removed in several ways. You can explicitly remove it from the cache using the Cache.Remove method, it can be removed because its lifetime has expired, or it can be implicitly removed from the cache to reduce memory consumption (scavenging). You have direct control over the first two cases. You explicitly call Cache.Remove, and you explicitly set the expiration date of items in the cache. Removal because of scavenging is not always under your direct control, although you can indicate a preference for how your cache items should be treated during a scavenging operation.

When scavenging is performed, the data cache removes items with low priority first. By default, your cache items have normal priority. If you want to directly control the priority of your cache items, you can set the priority value when you perform the insertion into the cache. Table 9-5 shows the various values for CacheItemPriority and their meanings. Note that you can request that an item in the cache not be removed during scavenging. Most of the time, it is wise to leave these priority values at their defaults and let the cache use its scavenging algorithms to decide which objects to remove.

Table 9-5. CacheItemPriority Values

| CacheItemPriority Value | Description |
| --- | --- |
| AboveNormal | Item less likely than Normal items to be removed from cache during scavenging |
| BelowNormal | Item more likely than Normal items to be removed from cache during scavenging |
| Default | Equivalent to Normal |
| High | Least likely to be deleted from the cache during scavenging |
| Low | Most likely to be deleted from the cache during scavenging |
| Normal | Deleted from the cache after all Low and BelowNormal items have been deleted during scavenging |
| NotRemovable | Never removed from the cache implicitly |
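As a sketch of how a priority and a removal callback might be supplied together, the page fragment below inserts an entry with Low priority and records why it was eventually removed. The key name, the method names (CacheReport, OnReportRemoved), and the five-minute sliding window are illustrative assumptions.

```csharp
// Place inside a page's <script runat="server"> block.
// "Report", CacheReport, and OnReportRemoved are hypothetical names.
private static void OnReportRemoved(string key, object val,
                                    CacheItemRemovedReason reason)
{
  // reason is one of: Removed (explicit removal or replacement),
  // Expired, Underused (scavenged), or DependencyChanged
  System.Diagnostics.Trace.WriteLine(
      "Cache entry '" + key + "' removed: " + reason);
}

private void CacheReport(object report)
{
  Cache.Insert("Report", report, null,
               Cache.NoAbsoluteExpiration,
               TimeSpan.FromMinutes(5),
               CacheItemPriority.Low, // among the first to be scavenged
               new CacheItemRemovedCallback(OnReportRemoved));
}
```

The callback is declared static here so that the delegate held by the cache does not keep a page instance alive after its request has completed.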

9.3.3 Data Cache Considerations and Guidelines

As with the output cache, using the data cache effectively involves making important decisions about data lifetime and estimating trade-offs in memory consumption and throughput. The following guidelines and considerations are designed to help you use the data cache as efficiently as possible.

  1. The data cache is not a container for shared updateable state.

    You should always anticipate the possibility that a request for an item in the data cache will return null, and you should never modify existing items (although replacing them with new objects is fine). The data cache does protect against concurrent writes to the Hashtable that is used internally to store cache entries, but that concurrency protection does not extend to accessing and modifying objects in the cache. In general, it is a bad idea to use any shared updateable state at the application scope anyway, because it often can become a bottleneck in application performance.

  2. Cache data that is accessed frequently and is relatively expensive to acquire.

    The effectiveness of caching data in a Web application depends on two factors: how often the data is accessed and how often it changes. As with output caching, if the data changes with each client request, caching it is a complete waste of resources. On the other hand, if the data does not change frequently but is almost never accessed, it is also a waste of resources to cache it (especially if it is big). Caching data retrieved from a database, especially if the database is on a remote machine, is almost always beneficial.

  3. If data is dependent on a file, directory, or other cache entry, use a CacheDependency to be sure it remains current.

    Many cache entries may have dependencies on external resources, or perhaps other cache entries. It is easy to ensure that these entries stay current by adding a CacheDependency when inserting them into the cache. Note that you can also signal that a cache entry is out of sync with a database value by adding a trigger to the database that modifies a file whenever the data is changed.

  4. Beware cache coherency problems.

    With the dramatic performance improvement that data caching brings, it is easy to begin relying on it too much and introducing cache coherency problems into your system. When you add data to the cache, carefully think through the different ways in which it will be accessed, and make sure that the data is not stale or that stale data is acceptable. In most cases, you can still achieve significant performance improvements even with short cache lifetime durations, especially if the data is accessed frequently.
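The guidelines above can be combined into a small read-through helper, sketched here under some assumptions: the CacheHelper class, the CacheLoader delegate, and the GetOrLoad method are hypothetical names, and the caller chooses a lifetime short enough that stale data is acceptable.

```csharp
using System;
using System.Web;
using System.Web.Caching;

// Hypothetical delegate: loads the value from its original source
public delegate object CacheLoader();

// Hypothetical helper, not part of ASP.NET
public sealed class CacheHelper
{
    private CacheHelper() {}

    public static object GetOrLoad(string key, CacheLoader loader,
                                   TimeSpan lifetime)
    {
        // Read once into a local; the entry may vanish at any time
        object value = HttpRuntime.Cache[key];
        if (value == null)
        {
            // Cache miss: go back to the original source
            value = loader();
            if (value != null) // Insert() rejects null values
            {
                HttpRuntime.Cache.Insert(key, value, null,
                    DateTime.Now.Add(lifetime),
                    Cache.NoSlidingExpiration);
            }
        }
        return value;
    }
}
```

A page might call it as (DataView)CacheHelper.GetOrLoad("EmployeesDataView", new CacheLoader(LoadEmployees), TimeSpan.FromSeconds(30)), where LoadEmployees is a method you supply that runs the database query. Two requests that miss at the same moment will both run the loader, which is usually an acceptable cost for read-only data.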

SUMMARY

ASP.NET introduces two significant caching features to improve application performance: output caching and data caching. Output caching provides a mechanism for caching rendered versions of pages so that subsequent access to those pages will not have to go through the entire rendering process. You enable output caching on a page by specifying an OutputCache directive, in which you can control the duration the page should be cached, how many different versions of the page should be cached, and whether the page should be cached on downstream proxies and in client browsers. Output caching is also applicable to user controls, where it is called page fragment caching. Applying the OutputCache directive to an .ascx file caches the rendering of that control the first time it is used on a page.

An application-level data cache is available through the Cache property of the HttpContext class. Any object can be inserted into the data cache, and each entry in the cache has its own set of attributes that control its lifetime. Cache entries can specify how long they should stay in the cache either by specifying a fixed time when they should be removed, a duration after the most recent access after which they should be removed, or a dependency on another cache entry or file that should trigger their removal. Entries in the cache are subject to scavenging according to priority, which gives ASP.NET a last recourse for reclaiming memory before bouncing its worker process.

 

posted on 2006-01-25 10:53 by 深瞳