Serialization and Scalability
I learned an important lesson this week. Sometimes a good way of doing things has other penalties :)
In the current application we're writing, we've taken over a bit of the serialization that would ordinarily be done by ASP.NET when talking to or from a webservice. Our data types are complex, and hoping that ASP.NET serializes an object properly when it is returned from a webmethod was causing some problems.
As a result, I thought I'd be a little tricky and invoke the serializer myself. Note here that we're not overriding the serialization (yet), only invoking the serializer explicitly, which lets us ensure that all data types the serializer comes across are handled. To do so, I create an XmlSerializer using this constructor signature:
Public Sub New(ByVal type As Type, ByVal extraTypes() As Type)
The trick here is that I was able to fill the extraTypes parameter, which you can't do when ASP.NET does the serialization for you.
This does cause a bit of ugliness, however. Every webmethod that returns one of our custom classes returns a string only, and it's the responsibility of the caller to deserialize it back. This is OK, because we whipped up a little wrapper class for the client end that takes care of it all for us.
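For illustration, the client side of that arrangement might look roughly like the sketch below. This is not our actual wrapper class: the module name ClientSerialization and the function FromXml are made up for the example, and it assumes the caller knows the root type and passes the same extra types the server used. It also assumes .NET 2.0 or later for the Using statement.

```vbnet
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical client-side helper; the names are illustrative only.
Public Module ClientSerialization
    ' Turns the XML string returned by a webmethod back into an object.
    ' rootType and extraTypes must match what the server serialized with.
    Public Function FromXml(ByVal xml As String, ByVal rootType As Type, ByVal extraTypes() As Type) As Object
        Dim serializer As New XmlSerializer(rootType, extraTypes)
        Using reader As New StringReader(xml)
            Return serializer.Deserialize(reader)
        End Using
    End Function
End Module
```

A caller would then write something like CType(FromXml(result, GetType(MyCustomType), ArrayOfOtherTypes()), MyCustomType) to get a typed object back.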
It's not following any patterns that I know of, but it was the only way at the time I could get past all the exceptions being thrown when ASP.net was serializing for me. I'm older and wiser now, but I don't have time to go back and rewrite it quite yet.
Anyway, what this did do is raise a big scalability issue (as I discovered this week). In any given web method, what we'd see is something like this:
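<WebMethod()> _
Public Function SomeWebMethod() As String
    Dim someObject As MyCustomType = DoSomeSortOfWork()
    ' A brand new serializer is constructed on every call.
    ' ArrayOfOtherTypes() stands in for whatever returns the extra types.
    Dim oSerializer As New Xml.Serialization.XmlSerializer(GetType(MyCustomType), ArrayOfOtherTypes())
    ' Serialize writes to a stream or writer, so capture the XML in a StringWriter.
    Dim oWriter As New IO.StringWriter()
    oSerializer.Serialize(oWriter, someObject)
    Return oWriter.ToString()
End Function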
This is a simplified sketch rather than our real code, but it gets the point across. The thing to point out is that given the stateless nature of web services, every time I needed a serializer for a given webmethod, I had to instantiate it myself.
But what happens when you load test and invoke 20 threads simultaneously to all call the same webmethod that involves serialization of a custom type?
The simple answer is: a hell of a lot! As we were load testing, the more simultaneous connections we specified, the longer each run took, and the increase looked almost exponential.
It turns out that the XmlSerializer does some tricky things inside its constructor. The actual Serialize method is quite fast and performs well, but the constructor is a nasty piece of work - nasty in the sense that unless you know what it's doing, you can't prepare for what's going to happen.
The XmlSerializer constructor works out how to serialize an object of the type it's been told about. Since it can't possibly know beforehand, this is when the type is analyzed. But it goes one step further than that. To make serialization itself very fast (I presume), instead of just setting up some internal data structures so it knows later what to do, it generates a temporary assembly written specifically to serialize that type. This means an in-memory compile has to take place, along with all manner of other things.
So when 20 simultaneous threads all call the XmlSerializer constructor, the CPU hits 100% usage for a nice while as it performs 20 compiles. Not very scalable, huh?
Our solution for now, until we have a chance to review it properly and come up with a good answer to the whole mess, is to avoid the problem in the first place. The obvious performance problem is the constructor for the serializer, so we need to jump past it. So we've written a pool (much like the connection pool for database connections) that allows serializer re-use. A set of XmlSerializers is created in the Application_Start event, and they are handed out for use as required when a webmethod call is made.
The end result is a bit of load on the server when the application starts - and now when I hit it with 40 simultaneous threads, my CPU usage never goes above 12%. And it's faster. Much, much faster. But the processor isn't killed anymore, and to me that's the important part of scalability.
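A minimal sketch of the re-use idea follows. It is not the pool we actually wrote: it swaps the pool for a simpler cache that keeps a single serializer per root type, which should be enough because the XmlSerializer class is documented as thread safe. The class name SerializerCache and its methods are hypothetical, it assumes .NET 2.0 or later for the generic Dictionary, and it assumes the same extra types are always passed for a given root type. As far as I know, the framework only re-uses its generated assemblies for the simpler constructors such as XmlSerializer(Type), so with the extraTypes overload you have to hold on to the instance yourself.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical helper: keeps one XmlSerializer per root type so the
' expensive constructor runs only once per type.
Public NotInheritable Class SerializerCache
    Private Shared ReadOnly _lock As New Object()
    Private Shared ReadOnly _cache As New Dictionary(Of Type, XmlSerializer)()

    Private Sub New()
    End Sub

    ' Returns the cached serializer for rootType, building it on first use.
    ' Assumption: extraTypes is always the same for a given rootType.
    Public Shared Function GetSerializer(ByVal rootType As Type, ByVal extraTypes() As Type) As XmlSerializer
        SyncLock _lock
            Dim serializer As XmlSerializer = Nothing
            If Not _cache.TryGetValue(rootType, serializer) Then
                serializer = New XmlSerializer(rootType, extraTypes)
                _cache.Add(rootType, serializer)
            End If
            Return serializer
        End SyncLock
    End Function

    ' Serializes value to an XML string using the cached serializer.
    Public Shared Function ToXml(ByVal value As Object, ByVal extraTypes() As Type) As String
        Dim serializer As XmlSerializer = GetSerializer(value.GetType(), extraTypes)
        Using writer As New StringWriter()
            serializer.Serialize(writer, value)
            Return writer.ToString()
        End Using
    End Function
End Class
```

Calling GetSerializer for each custom type from Application_Start would move the compile cost to startup, and a webmethod body would shrink to something like Return SerializerCache.ToXml(someObject, ArrayOfOtherTypes()).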
Good enough for now :)