Tuesday, June 21, 2011

The Role of ViewState in WebForms Applications

In the last couple of years I've been working extensively with the ASP.NET MVC framework. Before that, all my .NET coding was in Webforms. I never liked Webforms, and I immediately liked MVC. Now seems like a good moment to reflect on some of the features of each and work out why that is.

In this article I want to touch on what the Webforms ViewState is. I'll talk about how it works, and some of the issues it raises.

The Framework is Editing My HTML
Back when I worked with Webforms, I worked in an agency in which standards-compliant (W3C) HTML and accessibility were key selling points. So one of the main things that I disliked about Webforms was that it seemed so intent on interfering with my frontend code. In .NET 1.1 particularly, the WebControls spat out untidy, non-standards-compliant HTML.

Microsoft revised this and the HTML became standards-compliant for .NET 2. But I still didn't like the way it wrote HTML for me, as I was generally handed an HTML template that I was to stick to religiously for accessibility reasons. We found workarounds such as doing all WebControl work in the codebehind, but as we shall see, this approach has problems of its own.

On a more personal note, I was never happy with the big, ugly __VIEWSTATE string in the source either. The point for me was that the framework shouldn't be editing my HTML; that isn't its concern. Its concern is helping me generate whatever HTML I want to generate for each request.

I'll touch more on some of the problems at the bottom of this article, but for now I want to discuss mechanics. What is the Webforms ViewState for? And how does it work?

Duality of a Webforms Page
All Webforms developers know that there is a 1:1 relationship between a requested resource (a URL) and a processing resource (a Webforms 'page'). Each 'page' is composed of two elements - a codebehind (derived from System.Web.UI.Page), and an ASPX page containing server-side code (such as WebControls).

But although there are two elements for us as developers, there is only one element as far as the .NET engine is concerned. Each time you edit an ASPX page it is converted into an auto-generated class which logically complements the codebehind. As such you can make the same declarations in the codebehind or the ASPX page. The ASPX page is just a neat abstraction useful for layout and initialisation purposes, and friendly to people used to frontend HTML.

Any WebControls declared in the ASPX page are converted into their codebehind equivalents and compiled into the DLLs like everything else. Attributes of the WebControls are converted into value assignment operations (and conversions occur if the member variables are not typed as strings). The HTML portions of the ASPX page are converted into string literals, so the idea that you are editing a 'flat page' is illusory.
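To make that conversion concrete, here is a minimal sketch of the kind of code an ASPX page boils down to. It is not the real generator output: the SketchControl, SketchLiteral and SketchTextBox types are hypothetical stand-ins for the framework classes, and the MaxLength attribute is an assumption added only to show a non-string conversion.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins for the real control classes, kept tiny for illustration.
public class SketchControl
{
    public string ID;
    public List<SketchControl> Controls = new List<SketchControl>();
}

public class SketchLiteral : SketchControl
{
    public string Text;
    public SketchLiteral(string text) { Text = text; }
}

public class SketchTextBox : SketchControl
{
    public string CssClass;
    public int MaxLength;
}

public static class GeneratedPageSketch
{
    // Roughly what a page containing
    //   <asp:TextBox ID="myTextBox" CssClass="class1" MaxLength="20" runat="server" />
    // inside a form turns into once the markup has been converted to code.
    public static SketchControl BuildControlTree()
    {
        var page = new SketchControl { ID = "page" };

        // Static HTML before the form becomes a string literal.
        page.Controls.Add(new SketchLiteral("<html><body>"));

        var form = new SketchControl { ID = "form1" };
        var myTextBox = new SketchTextBox();
        myTextBox.ID = "myTextBox";
        myTextBox.CssClass = "class1";          // attribute becomes a plain assignment
        myTextBox.MaxLength = int.Parse("20");  // non-string member needs a conversion
        form.Controls.Add(myTextBox);
        page.Controls.Add(form);

        // Static HTML after the form becomes another string literal.
        page.Controls.Add(new SketchLiteral("</body></html>"));
        return page;
    }
}
```

Calling GeneratedPageSketch.BuildControlTree() returns a page object with three children: a literal, the form holding the TextBox, and a second literal.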

Request-Handling in Webforms
When a page is requested from IIS over HTTP, the request is handed to the ASP.NET engine (aspnet_isapi.dll). Internally the request passes through several HTTP modules and arrives at the HTTP handler (below), invoking the ProcessRequest() method. This method kicks off the following sequence:

[Figure: The Webforms HTTP request handling pipeline]
Notice the clear distinction between PostbackData and ViewState. Note that the loading of both only occurs on PostBack.
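As a rough text stand-in for the diagram, the sketch below lists the stages in order and marks which ones run only on PostBack. The stage names and their order are an assumption based on the commonly documented Web Forms page lifecycle, not a copy of the diagram; PipelineSketch and StagesFor are hypothetical names, not framework code.

```csharp
using System.Collections.Generic;

public static class PipelineSketch
{
    // Returns the stage names in execution order for one request.
    // This models the ordering only; it does not run any framework code.
    public static List<string> StagesFor(bool isPostBack)
    {
        var stages = new List<string> { "Initialization" };

        if (isPostBack)
        {
            // Both kinds of state are only restored when the form was posted back.
            stages.Add("LoadViewState");
            stages.Add("LoadPostbackData");
        }

        stages.Add("Load"); // Page_Load runs here on every request

        if (isPostBack)
        {
            stages.Add("RaisePostBackEvent"); // e.g. button click handlers
        }

        stages.Add("SaveViewState");
        stages.Add("Render");
        return stages;
    }
}
```

A first visit (isPostBack false) goes through four of these stages, while a PostBack goes through all seven.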

PostbackData vs. ViewState
PostbackData is a collection representing all form-field data from the HTTP POST body (all but one - the hidden __VIEWSTATE field). If you are using standard-issue WebControls such as the TextBox class, then your controls will have rendered the HTML form elements that generated this PostbackData. Typically there is one PostbackData item in the collection per rendered WebControl, and this item will correspond with a designated property of the WebControl. In the case of TextBox, the property TextBox.Text corresponds to the value attribute of a text input field.

<asp:TextBox Text="somevalue" runat="server" />
<input type="text" value="somevalue" ... />
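The mapping from a posted field back to the Text property can be sketched like this. PostedTextBoxSketch is a hypothetical stand-in for TextBox; its LoadPostData method is modelled on the shape of IPostBackDataHandler.LoadPostData, but the body is a simplification and not the framework's implementation.

```csharp
using System.Collections.Specialized;

// Hypothetical stand-in for a TextBox, showing only the PostbackData step.
public class PostedTextBoxSketch
{
    public string UniqueID { get; set; }
    public string Text { get; set; }

    // Looks up this control's posted field and copies it into Text.
    // Returns true when the posted value differs from the current one,
    // which is the signal a control would use to raise a change event later.
    public bool LoadPostData(string postDataKey, NameValueCollection postCollection)
    {
        string posted = postCollection[postDataKey]; // null if the field was not posted
        if (posted != null && posted != Text)
        {
            Text = posted;
            return true;
        }
        return false;
    }
}
```

Given a collection holding a "myTextBox" entry with the value "edited", calling LoadPostData("myTextBox", collection) on a control whose Text is "somevalue" updates Text and returns true. Any __VIEWSTATE entry in the same collection is simply ignored by this step.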

ViewState is a collection representing the dynamically assigned properties of each WebControl other than those designated to correspond with PostbackData.

I'll explain that sentence in more detail. Which properties of a WebControl is ViewState interested in storing? The answer is all properties except those associated with its value. Value-related properties are covered by PostbackData because they are already part of a standard HTML form submission over HTTP POST. Microsoft wanted some way to persist properties across HTTP requests which are not covered by form submission, and so they invented ViewState.

But there's more. If every property of every WebControl were to be serialised and deserialised on every request there would be a huge performance hit. As this article aptly points out, only WebControl properties which have been dynamically assigned during execution (i.e. marked as 'dirty' because they changed after being initialised with a default) are included in the ViewState.

<asp:TextBox ID="myTextBox" CssClass="class1" .. />

In other words, if you declare a CssClass attribute in your ASPX page, the class which is auto-generated from it will mirror that attribute into an assignment. This assignment will take place during the 'Initialization' phase of the pipeline, which if you check the diagram is the first thing that is done in response to a request.

Now, let's assume that you don't make any changes during the Load phase (i.e. in Page_Load). When the __VIEWSTATE string is generated, the assigned value will not be included because it has not been marked as dirty. Several page requests could occur and the value would not need to be persisted because each time the TextBox is initialised the CssClass will gain its default value.

myTextBox.CssClass = "class2";

Now, perhaps during your Page_Load event, inside some conditional statement you assign the CssClass property a new value of "class2". Now when execution reaches the SaveViewState part of the pipeline, it will discover that the CssClass property has been marked as dirty. Its name and value will be serialised into the __VIEWSTATE string, and when the postback occurs, LoadViewState will 'remember' the change.
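The dirty-marking behaviour described above can be modelled in a few lines. TrackedBagSketch is a hypothetical, simplified class written for this illustration (in the framework the per-control collection is System.Web.UI.StateBag); it only shows the idea that assignments made before tracking starts are not saved, while later ones are.

```csharp
using System.Collections.Generic;

// Hypothetical, simplified model of one control's ViewState collection.
public class TrackedBagSketch
{
    private readonly Dictionary<string, object> values = new Dictionary<string, object>();
    private readonly HashSet<string> dirty = new HashSet<string>();
    private bool tracking;

    // Called at the end of initialisation: from here on, assignments count as dirty.
    public void TrackViewState() { tracking = true; }

    public object this[string key]
    {
        get
        {
            object value;
            return values.TryGetValue(key, out value) ? value : null;
        }
        set
        {
            values[key] = value;
            if (tracking) dirty.Add(key);
        }
    }

    // Only dirty entries are handed over for serialisation.
    public Dictionary<string, object> SaveViewState()
    {
        var saved = new Dictionary<string, object>();
        foreach (string key in dirty) saved[key] = values[key];
        return saved;
    }

    // Restores saved entries. When called after TrackViewState, the restored
    // entries are marked dirty again, so they survive further postbacks.
    public void LoadViewState(Dictionary<string, object> saved)
    {
        foreach (KeyValuePair<string, object> pair in saved) this[pair.Key] = pair.Value;
    }
}
```

Setting bag["CssClass"] = "class1" before TrackViewState() leaves SaveViewState() empty. Setting bag["CssClass"] = "class2" afterwards produces a single saved entry, which a freshly initialised bag on the next request can pick up through LoadViewState.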

It should start to become clear now why doing all your WebControl work in the codebehind is a problem. Values assigned there (in Page_Load, say) are always dynamically assigned, and are therefore all persisted via ViewState. This increases page size, and because the __VIEWSTATE field is sent down with the page and posted back with every form submission, it can reduce performance considerably.

One ViewState Per Control
In case you're interested in how the memory state maps to the serialisation process, it's worth noting that every WebControl has its own ViewState collection. I'll explain.

When the ASPX-generated class is created, each control is placed in a hierarchy mirroring the hierarchy defined in the ASPX page. This is usually a top-level Page object, containing three second-level objects (a string literal containing some HTML, a Form object and another string literal with the rest of the HTML). The Form object then contains a series of WebControls, each of which may contain its own children and so on.

In order to serialise the __VIEWSTATE string, this hierarchy is traversed so that the resulting encoded string mirrors the hierarchy. What you see in the encoded __VIEWSTATE is the result - if you look at the middle of the encoded string you are probably looking at properties of a low-level child control. If you look at either edge you are probably looking at higher-level controls.
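The traversal can be pictured as a simple recursive walk. The sketch below uses hypothetical ControlNodeSketch and SavedNodeSketch types and assumes children are matched by position when state is loaded back. It stops at building the nested structure and leaves out the serialising and encoding of that structure into the hidden field.

```csharp
using System.Collections.Generic;

// Hypothetical control: an ID, its own dirty properties, and child controls.
public class ControlNodeSketch
{
    public string ID;
    public Dictionary<string, object> DirtyState = new Dictionary<string, object>();
    public List<ControlNodeSketch> Children = new List<ControlNodeSketch>();
}

// Hypothetical saved form: the same shape as the control tree, state only.
public class SavedNodeSketch
{
    public Dictionary<string, object> State;
    public List<SavedNodeSketch> Children = new List<SavedNodeSketch>();
}

public static class ViewStateTreeSketch
{
    // Walks the control tree depth-first and produces a mirror-image tree of state.
    public static SavedNodeSketch Save(ControlNodeSketch control)
    {
        var saved = new SavedNodeSketch
        {
            State = new Dictionary<string, object>(control.DirtyState)
        };
        foreach (ControlNodeSketch child in control.Children)
        {
            saved.Children.Add(Save(child));
        }
        return saved;
    }

    // Pushes saved state back into a freshly built tree, matching children by position.
    public static void Load(ControlNodeSketch control, SavedNodeSketch saved)
    {
        foreach (KeyValuePair<string, object> pair in saved.State)
        {
            control.DirtyState[pair.Key] = pair.Value;
        }
        for (int i = 0; i < saved.Children.Count && i < control.Children.Count; i++)
        {
            Load(control.Children[i], saved.Children[i]);
        }
    }
}
```

Because each control's state is nested inside its parent's node, the deepest controls end up furthest inside the saved structure, which fits the observation about the middle of the encoded string.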

Making Things UnRESTful
So the ViewState isn't about remembering form field values - that's the job of PostbackData. ViewState is actually about persisting the state of controls over several PostBack requests. It's as though Microsoft said: "Ok, you handle the persistence of your data between HTTP requests, and we'll handle the persistence of the controls. You set the properties of those controls as and when you like, we'll make sure they stay set."

Which is kind of cool in a way. After all, Webforms runs on an event model. If Microsoft didn't do this, you as a developer would have to include a procedure in all of your event-handling functions to check and ensure that all of the WebControls were correctly set up based on the current memory state of your application. It would be a performance hit, and a pain to create, set up and test.

But looking at it from another angle, it's forcing a square peg into a round hole. It's forcing an event-driven model onto a RESTful architecture. The web and HTTP are both based on the idea of REST. If, three PostBacks into a session, a user copies a URL and sends it to a friend, it is very unlikely that the friend will see the same page that the user was looking at.

It is true that even if you use a RESTful programming architecture (such as ASP.NET MVC) you may get the same problem, but it is much easier to code meaningful state-based views in RESTful architectures. Only the value data is lost when the URL is copied, and generally that is the part that you don't intend to send anyway when you copy a URL.

I'll go more deeply into these differences in another post, but for now it's enough to point out the mismatch between an event-driven model and REST, and to think of ViewState as a kind of compromise between the two.

Further Reading
These articles made for great reading in helping to assemble this post:
