ASP.NET SEO around VIEWSTATE

Written by QuangIT on 14/09/2012 at 04:48 PM
This article will bring you several SEO policies in terms of ViewState in ASP.NET.

Introduction

According to Wikipedia, search engine optimization (SEO) is the process of improving the visibility of a website or a web page in search engines via the "natural" or unpaid ("organic" or "algorithmic") search results. SEO can be divided into two categories: on-site SEO and off-site SEO. This article concentrates on the former. On-site SEO means optimizing the website itself in a normal, reasonable way that adapts to how search engines work. For websites built on ASP.NET Web Forms, one of the difficulties to tackle is ViewState. This article presents several SEO policies for dealing with it.

NOTE

The sample test environment in this article:

1. Windows 7;

2. .NET 4.0;

3. Visual Studio 2010;

4. Microsoft sample database Northwind.


What's ViewState

ViewState is a special structure closely tied to ASP.NET Web Forms pages. With this structure, ASP.NET Web Forms pages can maintain their own state across multiple client roundtrips. This page-level state is known as the view state of the page. The server sends the view state as a hidden field in a form, as part of every response to the client, and the client returns it to the server as part of a postback.

Figure 1: Roundtrips between the client side and server side around ViewState

Listing 1 shows the initial client-side state of the file Default.aspx under ASP.NET 4.0 (we've added nothing to the page).

<form method="post" action="Default.aspx" id="ctl01">
    <div class="aspNetHidden">
        <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTY1NDU2MTA1MmRkDad1UepYULRr7aGn3H03DwzCwYDCLGmNirXqIdu3jyg=" />
    </div>
    ...other code
</form>


It is commonly held that many search engines limit how much of each page they crawl: only the first few thousand characters of text are captured, and when the first 2KB of your page consists mainly of ViewState garbage, the page is likely to be penalized. In most real scenarios, the ViewState generated by ASP.NET often exceeds 20KB, which can seriously hamper search engines in indexing your Web pages. Hence, ViewState is a peculiar thing that makes people happy and, at the same time, makes them worry.

To make the ViewState concept clearer, a ViewState decoder helper class can show you what is actually encoded in the __VIEWSTATE hidden field on your ASP.NET Web Forms pages.
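Even without a full decoder, a few lines of C# are enough to see how much data the hidden field carries. The following is only a minimal sketch: the class name ViewStateInspector and its methods are made up for illustration, and it merely Base64-decodes the value from Listing 1 rather than interpreting the serialized object graph.

```csharp
using System;

// Hypothetical helper for illustration; not part of ASP.NET.
public static class ViewStateInspector
{
    // Decodes the Base64 text of a __VIEWSTATE field into raw bytes.
    public static byte[] Decode(string base64ViewState)
    {
        return Convert.FromBase64String(base64ViewState);
    }

    // Formats the first few bytes as hex so the serialized header is visible.
    public static string HeaderHex(byte[] data, int count)
    {
        int n = Math.Min(count, data.Length);
        return BitConverter.ToString(data, 0, n);
    }

    public static void Main()
    {
        // Value copied from Listing 1.
        string value = "/wEPDwUKMTY1NDU2MTA1MmRkDad1UepYULRr7aGn3H03DwzCwYDCLGmNirXqIdu3jyg=";
        byte[] raw = Decode(value);
        Console.WriteLine("Base64 characters: " + value.Length);
        Console.WriteLine("Decoded bytes:     " + raw.Length);
        Console.WriteLine("Leading bytes:     " + HeaderHex(raw, 2));
    }
}
```

Running the same check against the value copied from a data-heavy page gives a quick estimate of how many bytes ViewState adds to every response.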

At this point, some new ASP.NET developers may wonder: since ViewState can consume so much bandwidth and is so SEO-unfriendly, why not simply disable it at one of the following three levels?

1. At the machine level, in the file machine.config, set:

<pages enableViewState="false"/>

2. Or at the application level, in the file Web.config, set:

<pages enableViewState="false"/>

3. Or at the single-page level, set in the current page:

<%@ Page EnableViewState="false" %>

Or in code:

Page.EnableViewState = false;

Clearly, the above actions are all too drastic. If we disable ViewState altogether to get rid of its side effects, some excellent ASP.NET features may become unavailable on the page. So, is there a compromise or a better way to deal with ViewState? The answer is YES.

Policy 1 – Not Generating ViewState

The first option is to keep the cumbersome <input type="hidden" name="__VIEWSTATE" ...> field from being emitted on the target page at all. However, this method can be used only when the page does not rely on ViewState. The typical steps follow.

First, create a custom HttpModule:

using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;

namespace HiddenInputHttpModuleLib
{
    public class HiddenInputHttpModule : IHttpModule
    {
        private string _rawPath = "";

        public string GetPath() { return _rawPath; }

        public void Init(HttpApplication context)
        {
            context.ReleaseRequestState += new EventHandler(Context_ReleaseRequestState);
        }

        void Context_ReleaseRequestState(object sender, EventArgs e)
        {
            HttpResponse response = HttpContext.Current.Response;
            // Only HTML responses are filtered.
            if (response.ContentType == "text/html")
            {
                HiddenInputFilter filter = new HiddenInputFilter(response.Filter);
                response.Filter = filter;
            }
        }

        public void Dispose() { }
    }

    // Wraps the response stream and strips the __VIEWSTATE field as the HTML is written.
    public class HiddenInputFilter : Stream
    {
        Stream responseStream;
        long position;
        StringBuilder responseHtml;

        public HiddenInputFilter(Stream inputStream)
        {
            responseStream = inputStream;
            responseHtml = new StringBuilder();
        }

        public override bool CanRead { get { return true; } }
        public override bool CanSeek { get { return true; } }
        public override bool CanWrite { get { return true; } }
        public override void Close() { responseStream.Close(); }
        public override void Flush() { responseStream.Flush(); }
        public override long Length { get { return 0; } }
        public override long Position { get { return position; } set { position = value; } }
        public override long Seek(long offset, SeekOrigin origin) { return responseStream.Seek(offset, origin); }
        public override void SetLength(long length) { responseStream.SetLength(length); }
        public override int Read(byte[] buffer, int offset, int count) { return responseStream.Read(buffer, offset, count); }

        public override void Write(byte[] buffer, int offset, int count)
        {
            // Assumes the response is UTF-8 encoded (the ASP.NET default).
            // Note: Write may be called once per buffered chunk, so a tag that is
            // split across two chunks will not be matched.
            string finalHtml = Encoding.UTF8.GetString(buffer, offset, count);

            // Locate <input type="hidden" name="__VIEWSTATE" ... > and remove it
            string pattern = @"<input type=""hidden"" name=""__VIEWSTATE""[^>]*>";
            finalHtml = Regex.Replace(finalHtml, pattern, string.Empty, RegexOptions.IgnoreCase | RegexOptions.Multiline);

            // Write the filtered response
            byte[] data = Encoding.UTF8.GetBytes(finalHtml);
            responseStream.Write(data, 0, data.Length);
        }
    }
}


Although lengthy, the idea is simple: when the target page is rendered, the custom HttpModule inspects the generated HTML, strips out the __VIEWSTATE field, and sends the filtered markup to the client.

Using this approach is simple. First, compile the above module into a .dll assembly. Then, reference this assembly in the target Web project. Finally, modify the file Web.config, adding the following content to the system.web configuration section.

<httpModules>
  <add name="HiddenInputFilter" type="HiddenInputHttpModuleLib.HiddenInputHttpModule, HiddenInputHttpModuleLib"/>
</httpModules>

Note that putting the preceding HttpModule filter into use directly will result in no ASP.NET page generating __VIEWSTATE at all, which is contrary to our initial wish. To leverage this approach in real cases, you should look deeper into how to use multiple Web.config files in ASP.NET so that the module applies only to the pages that do not use ViewState.

Policy 2 - Persist ViewState on the Server Side

If you have a rich ASP.NET page with many controls that change in response to user actions, its ViewState may reach or exceed 200KB. One way to tame ViewState is to persist it on the server side. The ViewState then no longer consumes network bandwidth; accessing it costs only a disk read on the server. To cut the disk read time as well, we can fall back on caching.

To achieve the above target, we can override the two methods of the class Page, as follows:

// Requires: using System; using System.IO; using System.Web.UI;
protected override object LoadPageStateFromPersistenceMedium()
{
    // The hidden field now carries only the ID written by SavePageStateToPersistenceMedium.
    string viewStateID = (string)((Pair)base.LoadPageStateFromPersistenceMedium()).Second;
    string stateStr = (string)Cache[viewStateID];
    if (stateStr == null)
    {
        // Cache miss: fall back to the copy on disk.
        string fn = Path.Combine(this.Request.PhysicalApplicationPath, @"App_Data/ViewState/" + viewStateID);
        stateStr = File.ReadAllText(fn);
    }
    return new ObjectStateFormatter().Deserialize(stateStr);
}
protected override void SavePageStateToPersistenceMedium(object state)
{
    string value = new ObjectStateFormatter().Serialize(state);
    // A GUID avoids the collisions that a timestamp-based ID can produce under concurrent requests.
    string viewStateID = Guid.NewGuid().ToString("N");
    string fn = Path.Combine(this.Request.PhysicalApplicationPath, @"App_Data/ViewState/" + viewStateID);
    // The write could be queued on the thread pool instead of blocking the request.
    File.WriteAllText(fn, value);
    Cache.Insert(viewStateID, value);
    // Only the short ID is sent to the client in place of the full state.
    base.SavePageStateToPersistenceMedium(viewStateID);
}

First, this code can be placed in a specific page or in a base class shared by several pages. Second, this policy does not depend on Session. Because ViewState is saved to disk on the server side, the page state is not lost even if the server is restarted.
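The cache-in-front-of-disk idea can also be tried outside ASP.NET. The sketch below is an illustration under stated assumptions, not the article's implementation: FileStateStore, Save, Load and ForgetCache are hypothetical names, a plain Dictionary stands in for the ASP.NET Cache, and the class is not thread-safe. It also rejects keys that are not 32 hex digits, since the key travels through the client.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: keeps serialized state on disk with an in-memory cache in front.
public class FileStateStore
{
    private readonly string _directory;
    private readonly Dictionary<string, string> _cache = new Dictionary<string, string>();

    public FileStateStore(string directory)
    {
        _directory = directory;
        Directory.CreateDirectory(directory); // no-op when the directory already exists
    }

    // Stores the payload and returns the short key that would travel to the client.
    public string Save(string payload)
    {
        string key = Guid.NewGuid().ToString("N");
        File.WriteAllText(Path.Combine(_directory, key), payload);
        _cache[key] = payload;
        return key;
    }

    // Reads from the cache first and falls back to the file on disk.
    public string Load(string key)
    {
        // Accept only keys in the 32-hex-digit form produced by Save,
        // so a tampered key cannot point outside the state directory.
        Guid parsed;
        if (!Guid.TryParseExact(key, "N", out parsed))
            throw new ArgumentException("Unexpected state key.", "key");

        string payload;
        if (_cache.TryGetValue(key, out payload))
            return payload;

        payload = File.ReadAllText(Path.Combine(_directory, key));
        _cache[key] = payload;
        return payload;
    }

    // Empties the in-memory cache, as an application restart would.
    public void ForgetCache()
    {
        _cache.Clear();
    }

    public static void Main()
    {
        string dir = Path.Combine(Path.GetTempPath(), "ViewStateSketch");
        FileStateStore store = new FileStateStore(dir);
        string key = store.Save("serialized-page-state");
        store.ForgetCache(); // the next Load has to read the file
        Console.WriteLine(key.Length + " character key -> " + store.Load(key));
    }
}
```

The client only ever sees the short key, which is why the page's hidden field shrinks regardless of how much state the controls hold.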

To use the above solution, you must first create the directory "/App_Data/ViewState". A good place to do this is the Application_Start method in the file Global.asax.cs.

void Application_Start(object sender, EventArgs e)
{
    var dir = new DirectoryInfo(this.Server.MapPath("~/App_Data/ViewState/"));
    if (!dir.Exists)
        dir.Create();
    else
    {
        var nt = DateTime.Now.AddHours(-1);
        FileInfo[] info = dir.GetFiles();
        Array.ForEach<FileInfo>(info,f=>{
            if (f.CreationTime < nt)
                f.Delete();
        });
    }
}

This makes sure the directory exists and clears out state files older than one hour, although the cleanup runs only when the application starts.

Let's look at the page at run time. Figure 2 shows the initial screen.

Figure 2: The initial screenshot showing data in the table authors (using the sample database Northwind)

The related markup in the file ViewStateOnServer.aspx looks like the following:

<asp:GridView ID="GridView1" runat="server" AllowPaging="True" 
    AutoGenerateColumns="False" DataKeyNames="au_id" DataSourceID="SqlDataSource1">
    <Columns>
        <asp:BoundField DataField="au_id" HeaderText="au_id" ReadOnly="True" 
            SortExpression="au_id" />
        <asp:BoundField DataField="au_lname" HeaderText="au_lname" 
            SortExpression="au_lname" />
        <asp:BoundField DataField="au_fname" HeaderText="au_fname" 
            SortExpression="au_fname" />
        <asp:BoundField DataField="phone" HeaderText="phone" SortExpression="phone" />
        <asp:BoundField DataField="address" HeaderText="address" 
            SortExpression="address" />
        <asp:BoundField DataField="city" HeaderText="city" SortExpression="city" />
        <asp:BoundField DataField="state" HeaderText="state" SortExpression="state" />
        <asp:BoundField DataField="zip" HeaderText="zip" SortExpression="zip" />
        <asp:CheckBoxField DataField="contract" HeaderText="contract" 
            SortExpression="contract" />
    </Columns>
</asp:GridView>
<asp:SqlDataSource ID="SqlDataSource1" runat="server" 
    ConnectionString="<%$ ConnectionStrings:ConnectionString %>" 
    SelectCommand="SELECT * FROM [authors]"></asp:SqlDataSource>

Since the above code is only an elementary GridView + SqlDataSource implementation, it needs no further explanation. The following shows the initial HTML source code rendered on the client side.

Figure 3: The initial HTML source code with ViewState enabled

Next, let's look at the screenshot with the ViewState removed, with the help of the two overridden methods above: LoadPageStateFromPersistenceMedium and SavePageStateToPersistenceMedium.

Figure 4: The bloated ViewState disappears from the client-side HTML code (not all of it is removed)

What do you think of the above solution? On the surface, the policy in this section is pretty good; in fact, many aspects of it are debatable. A deeper investigation is beyond the scope of this article; interested readers can continue the exploration.

In the next section, we'll examine another possible way to lessen the effect of ViewState on system performance.

Policy 3 - Load ViewState as Late as Possible

This idea is easy to follow. Since ViewState commonly produces a large block of HTML, we can emit it as late as possible so that it does not push the real content down the page. In practice, this means placing it right before the generated </form> tag. To do this, first locate the __VIEWSTATE field and then move it. According to the ASP.NET Web Forms page lifecycle, the appropriate place to do this is the method Render.

// Requires: using System.Text.RegularExpressions; using System.Web.UI;
// Matches the __VIEWSTATE hidden field
private static readonly Regex viewStateRegex = new Regex(@"<input type=""hidden"" name=""__VIEWSTATE"" id=""__VIEWSTATE"" value=""[^""]*"" />", RegexOptions.Compiled);
// Matches the closing </form> tag
private static readonly Regex endFormRegex = new Regex(@"</form>", RegexOptions.Compiled);
protected override void Render(HtmlTextWriter writer)
{
    // Render the page into a string first so the markup can be rearranged.
    System.IO.StringWriter stringWriter = new System.IO.StringWriter();
    HtmlTextWriter htmlWriter = new HtmlTextWriter(stringWriter);
    base.Render(htmlWriter);
    string html = stringWriter.ToString();
    Match viewStateMatch = viewStateRegex.Match(html);
    if (viewStateMatch.Success)
    {
        string viewStateString = viewStateMatch.Value;
        html = html.Remove(viewStateMatch.Index, viewStateMatch.Length);
        Match endFormMatch = endFormRegex.Match(html, viewStateMatch.Index);
        // Insert before </form>; if no closing tag is found, put the field back where it was.
        int target = endFormMatch.Success ? endFormMatch.Index : viewStateMatch.Index;
        html = html.Insert(target, viewStateString);
    }
    writer.Write(html);
}

The above code is also easy to understand. First, define the regular expressions that match the ViewState field and the closing form tag. Next, override the method Render of the current Web page. Inside it, first call the method Render of the base class to produce the normal HTML, then locate the ViewState field in that HTML and move it to the bottom of the form.
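The string manipulation at the heart of this policy can be tried in isolation. The following is a sketch only: ViewStateMover and MoveToFormEnd are hypothetical names, the pattern is a simplified one that matches any attributes after the name, and the sample HTML is invented for the demonstration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; operates on a plain HTML string.
public static class ViewStateMover
{
    private static readonly Regex ViewStateField = new Regex(
        @"<input type=""hidden"" name=""__VIEWSTATE""[^>]*>",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the HTML with the __VIEWSTATE field moved to just before the next </form>.
    // The input is returned unchanged when either the field or the closing tag is missing.
    public static string MoveToFormEnd(string html)
    {
        Match field = ViewStateField.Match(html);
        if (!field.Success)
            return html;

        string without = html.Remove(field.Index, field.Length);
        int formEnd = without.IndexOf("</form>", field.Index, StringComparison.OrdinalIgnoreCase);
        if (formEnd < 0)
            return html;

        return without.Insert(formEnd, field.Value);
    }

    public static void Main()
    {
        string html =
            "<form><input type=\"hidden\" name=\"__VIEWSTATE\" id=\"__VIEWSTATE\" value=\"/wEP\" />" +
            "<p>content</p></form>";
        Console.WriteLine(MoveToFormEnd(html));
    }
}
```

Because the field stays inside the form, it is still posted back as usual; only its position in the markup changes.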

The following figure shows an example of ViewState moved to the end of the client-side HTML.

Figure 5: ViewState moved to the end of the HTML content (Default.aspx)

A last word: many well-known websites are said to have adopted the above approach as one of their policies against ViewState.

Better Control over ViewState in ASP.NET 4.0

ASP.NET 4 introduces a number of new runtime features that help optimize your website for SEO, such as the new MetaKeywords and MetaDescription properties of the class Page, improved URL routing support, and the new Response.RedirectPermanent() method. Those features are left for another article; here we concentrate on the improved ViewState support in ASP.NET 4.0.

Although ASP.NET 2.0/3.5 provides support for the EnableViewState property, we cannot disable ViewState at a Page level and, at the same time, enable it for individual controls on that page that require it. By introducing a new ViewStateMode property, ASP.NET 4.0 gives more flexible control over that.

Let's next construct related examples and observe the differences.

ViewState in ASP.NET 2.0/3.5

Use Visual Studio 2010 to open the sample website AspNetSeoViewState, and then add an ASP.NET Web Form page named AspNet35ViewStatePage.aspx. Now, drag two Label controls and one Button control onto the page. Finally, disable view state on the Page level by setting the EnableViewState property to false.

<%@ Page Language="C#" AutoEventWireup="true" CodeBehind="AspNet35ViewStatePage.aspx.cs" 
Inherits="AspNetSeoViewState.AspNet35ViewStatePage"
EnableViewState="false" %>

For the second Label control, explicitly set its EnableViewState property to true (which is also the default value for all controls).

<body>
    <form id="form1" runat="server">
        <div>
            <asp:Label ID="Label1" runat="server" ForeColor="Lime" 
                Text="Label1"></asp:Label>
            <br />
            <asp:Label ID="Label2" runat="server" EnableViewState="True" ForeColor="Red" 
                Text="Label2"></asp:Label>
            <hr />
            <asp:Button ID="btnPostBack" runat="server" onclick="btnPostBack_Click" Text="Start PostBack" />
        </div>
    </form>
</body>

In the code-behind file, fill in the following code:

protected void Page_Load(object sender, EventArgs e)
{
    // Runs only on the first request; after a postback the labels depend on ViewState.
    if (!IsPostBack)
    {
        Label1.Text = "Label1 Changed";
        Label2.Text = "Label2 Changed";
    }
}

Now, let's look at the page at run time. Right-click the file AspNet35ViewStatePage.aspx and choose "View in Browser". Figure 6 shows the initial screen.

Figure 6: The initial screenshot

As expected, the Page_Load event handler is triggered, and the two assignment statements are executed only once, on the initial request.

Next, click the button "Start PostBack", and the subsequent screenshot appears, as shown in Figure 7.

Figure 7: The screenshot after clicking the button “Start PostBack”

ViewState is disabled for the entire page, yet we would expect Label2 to save ViewState and retain its new value 'Label2 Changed' after the postback, because ViewState is explicitly enabled on it. In fact, as the above figure shows, although clicking the button causes a postback, Label2 does not retain the value ('Label2 Changed') set on it. This is not what we expected!

New support towards ViewState in ASP.NET 4.0

Now, add another new Web Form page named AspNe4ViewStatePage.aspx. The only difference is that here we will use the new ViewStateMode property. According to MSDN, the ViewStateMode property may be assigned one of three values: Enabled, Disabled, and Inherit.

  • Enabled - enables view state for that control and any child controls that are set to 'Inherit' or that have nothing set.
  • Disabled - disables view state.
  • Inherit - specifies that the control uses the ViewStateMode setting from the parent control.
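The way these three values resolve through the control tree can be modelled in a few lines. This is a simplified sketch, not the ASP.NET implementation: StateMode and Node are hypothetical stand-ins for the real ViewStateMode enumeration and Control class, a root left at Inherit is treated as Enabled, and the separate EnableViewState property (which must remain true for ViewStateMode to have any effect) is ignored.

```csharp
using System;

// Hypothetical stand-in for the ViewStateMode enumeration.
public enum StateMode { Inherit, Enabled, Disabled }

// Hypothetical stand-in for a control in the page's control tree.
public class Node
{
    public string Name;
    public StateMode Mode = StateMode.Inherit;
    public Node Parent;

    public Node(string name, Node parent)
    {
        Name = name;
        Parent = parent;
    }

    // Walks up the parent chain until an explicit Enabled or Disabled is found.
    // If every ancestor says Inherit, the sketch treats view state as enabled.
    public bool SavesViewState()
    {
        for (Node n = this; n != null; n = n.Parent)
        {
            if (n.Mode == StateMode.Enabled) return true;
            if (n.Mode == StateMode.Disabled) return false;
        }
        return true;
    }

    public static void Main()
    {
        Node page = new Node("Page", null);
        page.Mode = StateMode.Disabled;

        Node label1 = new Node("Label1", page);   // left at Inherit
        Node label2 = new Node("Label2", page);
        label2.Mode = StateMode.Enabled;          // opts back in

        Console.WriteLine(label1.Name + " saves view state: " + label1.SavesViewState());
        Console.WriteLine(label2.Name + " saves view state: " + label2.SavesViewState());
    }
}
```

In this model a control set to Enabled keeps its state even under a Disabled parent, which is exactly the combination that EnableViewState alone could not express.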

Now, let's look at the related markup in the new file AspNe4ViewStatePage.aspx:

<%@ Page Language="C#" AutoEventWireup="true" CodeBehind="AspNe4ViewStatePage.aspx.cs" 
Inherits="AspNetSeoViewState.AspNe4ViewStatePage"
ViewStateMode="Disabled"  %>
<!-- others omitted -->
</head>
<body>
    <form id="form1" runat="server">
        <div>
            <asp:Label ID="Label1" runat="server" ForeColor="Lime" 
                Text="Label1"></asp:Label>
            <br />
            <asp:Label ID="Label2" runat="server" ForeColor="Red" 
                Text="Label2" ViewStateMode="Enabled"></asp:Label>
            <hr />
            <asp:Button ID="btnPostBack" runat="server" onclick="btnPostBack_Click" 
                Text="Start PostBack" />
        </div>
    </form>
</body>
</html>

Note that ViewStateMode on the Page is set to Disabled. Since the child Label2 control has ViewStateMode set to Enabled, Label2 should save view state. On the other hand, since the ViewStateMode property is not set on Label1, it will inherit this property value from its parent (Page) and therefore will persist no view state.

Now, let's look at the page at run time. Right-click the file AspNe4ViewStatePage.aspx and choose "View in Browser". The initial screen looks just like the previous Figure 6. Next, click the button "Start PostBack", and the screen shown in Figure 8 appears.

Figure 8: The screenshot after clicking the button “Start PostBack”

You will note that when the page first loads, both the Label controls display the text as set in the code-behind file. However, after hitting the button and causing a postback, the control Label2 does retain its value as we expect.

On the whole, in ASP.NET 4.0 we can disable ViewState at a parent level while enabling it only for those child controls that require it. This helps improve performance with less effort and gives us finer control.

Summary

For any public website, search engine optimization (SEO) is extremely important. Because a large share of a site's traffic typically comes from search engines, improving your ranking in them can increase not only your site's traffic but also, directly or indirectly, your revenue. In this article, we've examined only some of the ViewState-related SEO techniques for ASP.NET Web Forms. As for which one is better, opinions differ; each needs to be tested in practice.


Article source: Dngaz.com
