Public Class GeoffAppleby

Inherits Microsoft.VisualBasic.MVP : Implements IBrainFart
Serialization and Scalability

I learned an important lesson this week. Sometimes a good way of doing things has other penalties :)

In the current application we're writing, we've even taken over a bit of the serialization that would ordinarily be done by ASP.net when talking to/from a webservice. Our data types are complex, and hoping that ASP.net serializes an object properly when it's returned from a webmethod was causing some problems.

As a result, I thought I'd be a little tricky and invoke the serializer myself. Note here that we're not overriding the serialization (yet), only invoking the serializer explicitly, which lets us ensure that all data types that the serializer comes across are handled. To do so, I create an XmlSerializer using the constructor with this signature:

Public Sub New(ByVal type as Type, ByVal extraTypes() as Type)

The trick here is that I was able to fill the extraTypes parameter, which you can't do when ASP.net does it for you.
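As a rough illustration of that constructor in use, here is a minimal sketch. The Shape, Circle and Drawing classes and the ToXml helper are made up for this example (they are not from our application), and it assumes a reasonably recent VB compiler for the Using and With syntax.

```vb
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical types, for illustration only.
Public Class Shape
    Public Name As String
End Class

Public Class Circle
    Inherits Shape
    Public Radius As Double
End Class

Public Class Drawing
    ' Declared as the base type, but may hold a Circle at run time.
    Public Item As Shape
End Class

Module ExtraTypesDemo
    Function ToXml(ByVal value As Drawing) As String
        ' The second argument lists types the serializer cannot
        ' discover by looking at Drawing alone.
        Dim serializer As New XmlSerializer(GetType(Drawing), New Type() {GetType(Circle)})
        Using writer As New StringWriter()
            serializer.Serialize(writer, value)
            Return writer.ToString()
        End Using
    End Function

    Sub Main()
        Dim d As New Drawing()
        d.Item = New Circle() With {.Name = "c1", .Radius = 2.5}
        Console.WriteLine(ToXml(d))
    End Sub
End Module
```

Without Circle in the extraTypes array, the serializer has no way of knowing about it from Drawing's declaration, and the call to Serialize fails at run time.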

This does cause a bit of ugliness, however. Every webmethod that returns one of our custom classes returns a string only, and it's the responsibility of the caller to deserialize it back. This is OK, because we whipped up a little wrapper class for the client end that takes care of it all for us.
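A client-side wrapper along those lines might look roughly like the sketch below. FromXml is a hypothetical name, not our actual wrapper class, and it assumes the caller passes the same extra types the server used.

```vb
Imports System
Imports System.IO
Imports System.Xml.Serialization

Module ClientWrapperSketch
    ' Hypothetical helper: turns the string returned by a webmethod
    ' back into an object of type T.
    Function FromXml(Of T)(ByVal xml As String, ByVal ParamArray extraTypes() As Type) As T
        ' Note: this builds a new serializer on every call.
        Dim serializer As New XmlSerializer(GetType(T), extraTypes)
        Using reader As New StringReader(xml)
            Return CType(serializer.Deserialize(reader), T)
        End Using
    End Function
End Module
```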

It's not following any patterns that I know of, but it was the only way at the time I could get past all the exceptions being thrown when ASP.net was serializing for me. I'm older and wiser now, but I don't have time to go back and rewrite it quite yet.

Anyway, what this did do is raise a big scalability issue (as I discovered this week). In any given web method, what we'd see is something like this:

  <WebMethod()> _
  Public Function SomeWebMethod() As String
    Dim someObject As MyCustomType
    someObject = DoSomeSortOfWork()

    ' A brand new serializer is constructed on every call
    Dim oSerializer As New Xml.Serialization.XmlSerializer(GetType(MyCustomType), ArrayOfOtherTypes())
    Dim oWriter As New IO.StringWriter()
    oSerializer.Serialize(oWriter, someObject)
    Return oWriter.ToString()
  End Function

This is a simplified sketch (MyCustomType, DoSomeSortOfWork and ArrayOfOtherTypes are placeholders), but it gets the point across. The thing to point out is that given the stateless nature of web services, every time I needed a serializer for a given webmethod, I had to instantiate it myself.

But what happens when you load test and invoke 20 threads simultaneously to all call the same webmethod that involves serialization of a custom type?

The simple answer is: a hell of a lot! As we were load testing, the more simultaneous connections we specified the longer it would take - but almost exponentially longer.

It turns out that the XmlSerializer does some tricky things inside its constructor. The actual Serialize method is quite fast, and performs well, but the constructor is a nasty piece of work - nasty, in the sense that unless you know what it's doing, you can't prepare for what's going to happen.

The XmlSerializer constructor works out how to serialize an object of the type it's been told about. Since it can't possibly know beforehand, it's at this time that the type is analyzed. But it goes one step further than that. To make serialization itself very fast (I presume), instead of setting up some internal data structures so it knows later what to do, it generates a custom temporary assembly that is specifically written to do the serialization of that specific type. This means that an in-memory compile needs to take place, and all manner of other things.

So when 20 simultaneous threads all call the XmlSerializer constructor, the CPU hits 100% usage for a nice while as it performs 20 compiles. Not very scalable, huh?

Our solution for now, until we have a chance to review it properly and come up with a good fix for the whole mess, is to avoid the problem in the first place. The obvious performance problem is the serializer's constructor, so we need to jump past it. We've written a pool (much like the connection pool for database connections) that allows serializer reuse. A set of XmlSerializers is created in the Application_Start event, and they are handed out for use as required when a webmethod call is made.

The end result is a bit of load on the server when the application starts - and now when I hit it with 40 simultaneous threads, my CPU usage never goes above 12%. And it's faster. Much, much faster. But the key thing is that the processor isn't killed anymore - and to me, that's the important part of scalability.
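To give a feel for the idea, here is a minimal sketch of serializer reuse. It is not our actual pool: it keeps a single cached XmlSerializer per root type behind a lock, which is a simplification that leans on XmlSerializer being documented as thread safe. SerializerCache, Register and ToXml are hypothetical names, and it assumes a VB compiler with generics and Using.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical sketch of a serializer cache; not the real pool.
Public NotInheritable Class SerializerCache
    Private Shared ReadOnly cache As New Dictionary(Of Type, XmlSerializer)()
    Private Shared ReadOnly gate As New Object()

    Private Sub New()
    End Sub

    ' Call once per root type at start-up (for example from Application_Start),
    ' so the expensive constructor runs before any requests arrive.
    Public Shared Sub Register(ByVal rootType As Type, ByVal ParamArray extraTypes() As Type)
        SyncLock gate
            If Not cache.ContainsKey(rootType) Then
                cache(rootType) = New XmlSerializer(rootType, extraTypes)
            End If
        End SyncLock
    End Sub

    ' Serializes with the already-built serializer; nothing is constructed here.
    ' Looks up by the exact run-time type of the value.
    Public Shared Function ToXml(ByVal value As Object) As String
        Dim serializer As XmlSerializer = Nothing
        SyncLock gate
            If Not cache.TryGetValue(value.GetType(), serializer) Then
                Throw New InvalidOperationException("No serializer registered for " & value.GetType().FullName)
            End If
        End SyncLock
        Using writer As New StringWriter()
            serializer.Serialize(writer, value)
            Return writer.ToString()
        End Using
    End Function
End Class
```

With something like this in place, a webmethod body shrinks to a call such as Return SerializerCache.ToXml(someObject), and the constructor cost is paid once at start-up instead of once per request.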

Good enough for now :)

Listening to: pretty noose - soundgarden - (4:13)
Posted: Friday, January 14, 2005 7:13 AM by Geoff Appleby

Comments

JosephCooney said:

Hi Geoff - When you serialize something with the Xml Serializer it creates a specialized xml reader (for deserializing your type) and writer (for serializing your type). This is how the web services stack in .NET manages to out-perform most other web service stacks (which don't use this type of code-gen approach). However, some overloads of the Xml Serializer constructor (including the one in your example pseudo code) cause a new reader/writer pair to be generated each time the constructor is called. This has two effects - firstly it makes your code slower since it has to re-generate the code each time. This code gets written to disc, compiled, dynamically loaded into your current app-domain, JIT compiled and then run and deleted from disc. The second effect is that because you're constantly loading new dynamically generated types, the amount of memory taken up by your app domain will grow and grow. This is not memory taken up by allocated objects, but memory taken up by dynamically loaded types. There is no way to unload a dynamically loaded assembly from within the same app-domain. This memory can't be freed by garbage collection and will stay around until your appdomain is unloaded (in the case of a server app, when your machine crashes because it's run out of memory). I wrote about this a while ago here: http://jcooney.net/archive/2003/10/21/430.aspx but I'll probably update it, since the example I used causes a temporary serialization assembly to be generated each time under framework 1.0 but NOT under framework 1.1 (making it a fairly sucky example now).
# March 7, 2005 7:57 AM

JosephCooney said:

Hi Geoff - I should have read your post a little more clearly and I would have seen that you seem to mostly know this stuff already ;-), however the way the serializer works it doesn't ALWAYS make a temporary assembly, just for the overload you're using.
# March 7, 2005 4:57 PM

Geoff Appleby said:

Hey Joseph :) Yeah, I didn't want to say anything....

Actually, the biggest reason for not saying 'duh, I know!' is quite simply that others who read the post likely don't - and hopefully they read the comments too. When it comes to stuff like this, I can say what I've noticed, and what I assume, and all that stuff, but having someone reiterate it (in whatever form) may be clearer for people who can't follow my (often derailed) train of thought.

Keep it up dude - as you said, I mostly know it, but there were still some gems in there that I didn't, and I always like to learn. If nothing else, I know that next time I have trouble with XML serialization that I can't figure out, I can ask you :)
# March 7, 2005 5:19 PM