A Little Enum Tip (the second)
A few weeks ago I wrote a post titled 'A Little Enum Tip'.
In it I discussed mapping some random Integer value into an Enum, after validating that value was an actual member of the Enum in question.
I finished with a nice generic code sample that could handle any enum, including values which map to multiple values in the case of Flags based enums.
I also finished by mentioning that there were a few bugs still in there, asking for volunteers to point them out. There was some discussion on it, most of it fairly obvious.
The worst one that I'd missed (as opposed to left out) was the fact that when I was generating the bitset representing all values defined within a flags Enum, I was using + instead of Or. It's an easy mistake to make, but quite a crucial one. Given that it's so easy to let slip through, I thought I'd give a bit of an explanation on why this was so bad to do - forgive me if I'm preaching to the choir, but I'm sure there's someone out there who might actually find this information useful :)
When defining values for a flags enum, what you're really doing is stating allowable bit masks. The most common way of defining a flags Enum is to use 'powers of two' numbers, either in decimal or hex.
<Flags()> _
Public Enum EnumSample
    None = 0
    FirstValue = 1
    SecondValue = 2
    ThirdValue = 4
    FourthValue = 8
End Enum
But why is this? To really get it, we need to fall down to binary. Here are the five of them:
| None        | 0000 |
| FirstValue  | 0001 |
| SecondValue | 0010 |
| ThirdValue  | 0100 |
| FourthValue | 1000 |
As you can see, there's only one 1 in any given column. Each column in binary represents a power of two. When two numbers are AND'd together, consider them in binary and keep only the 1s where they appear in the same column on all rows.
| FirstValue     | 0001 |
| AND OtherValue | 1111 |
| =              | 0001 |
We can only keep the 1 that's in the last column, as it's the only one that appears in both rows. So ANDing one number with a known bitset means that we can discover if that number contains the known bitset very easily. In the above example, (FirstValue AND OtherValue) = FirstValue. So we know we have a match. But if we redefine OtherValue:
| FirstValue     | 0001 |
| AND OtherValue | 1110 |
| =              | 0000 |
We now have (FirstValue AND OtherValue) <> FirstValue, so we don't have a match this time.
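The two AND tables above can be tried out in code. This is only a minimal sketch: the module name and the ContainsFlag helper are made up for illustration, and the enum is the EnumSample from the top of the post.

```vbnet
Imports System

Module FlagTestSketch
    <Flags()> _
    Public Enum EnumSample
        None = 0
        FirstValue = 1
        SecondValue = 2
        ThirdValue = 4
        FourthValue = 8
    End Enum

    ' Hypothetical helper: True when every bit of peFlag is also set in peValue.
    Public Function ContainsFlag(ByVal peValue As EnumSample, ByVal peFlag As EnumSample) As Boolean
        Return (peValue And peFlag) = peFlag
    End Function

    Sub Main()
        Dim eAll As EnumSample = CType(15, EnumSample)   ' 1111
        Dim eMost As EnumSample = CType(14, EnumSample)  ' 1110
        Console.WriteLine(ContainsFlag(eAll, EnumSample.FirstValue))   ' True
        Console.WriteLine(ContainsFlag(eMost, EnumSample.FirstValue))  ' False
    End Sub
End Module
```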
OR works in a similar way, except that instead of keeping only the 1s that appear in every row of a column, we keep a 1 if it appears in any row.
| FirstValue    | 0001 |
| OR OtherValue | 1111 |
| =             | 1111 |

| FirstValue    | 0001 |
| OR OtherValue | 1110 |
| =             | 1111 |
In both cases we ended up with '1111'.
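Here is a small sketch of the same idea with real enum values. It assumes the EnumSample from the top of the post; the module name and the Combine helper are invented for the example. The point to notice is that Or-ing in a flag that is already set changes nothing.

```vbnet
Imports System

Module OrCombineSketch
    <Flags()> _
    Public Enum EnumSample
        None = 0
        FirstValue = 1
        SecondValue = 2
        ThirdValue = 4
        FourthValue = 8
    End Enum

    ' Hypothetical helper: merges two flag values with Or.
    Public Function Combine(ByVal peLeft As EnumSample, ByVal peRight As EnumSample) As EnumSample
        Return peLeft Or peRight
    End Function

    Sub Main()
        Dim eValue As EnumSample = Combine(EnumSample.FirstValue, EnumSample.ThirdValue)
        Console.WriteLine(eValue.ToString())  ' FirstValue, ThirdValue
        ' FirstValue is already set, so the value stays at 5 (0101)
        eValue = Combine(eValue, EnumSample.FirstValue)
        Console.WriteLine(CInt(eValue))       ' 5
    End Sub
End Module
```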
So now, what's the problem with using plus? Plus only gives the same answer as Or when the two numbers have no 1s in the same column. If we look at the last sample, the results agree:
| FirstValue   | 0001 |
| + OtherValue | 1110 |
| =            | 1111 |
But for the sample before that:
| FirstValue   | 0001 |
| + OtherValue | 1111 |
| =            | 10000 |
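Not the same at all now, is it? Both numbers have a 1 in the last column, so the addition carries, and that carry ripples all the way up. 0001 AND 10000 does not give us back 0001 anymore.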
Poo.
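To see the difference for yourself, here is a rough sketch that runs the 0001 and 1111 example through both operators. The module and the ToBinary helper are made-up names, used purely for display.

```vbnet
Imports System

Module PlusVersusOrSketch
    ' Hypothetical helper: binary text of a value, for display only.
    Public Function ToBinary(ByVal plValue As Long) As String
        Return Convert.ToString(plValue, 2)
    End Function

    Sub Main()
        Dim lFirst As Long = 1    ' 0001
        Dim lOther As Long = 15   ' 1111
        Dim lOred As Long = lFirst Or lOther
        Dim lSummed As Long = lFirst + lOther

        Console.WriteLine(ToBinary(lOred))               ' 1111
        Console.WriteLine(ToBinary(lSummed))             ' 10000
        ' The match test still works on the Or result...
        Console.WriteLine((lFirst And lOred) = lFirst)   ' True
        ' ...but fails on the summed one, because the carry cleared the low bit
        Console.WriteLine((lFirst And lSummed) = lFirst) ' False
    End Sub
End Module
```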
On top of that, we've got a problem with size as well. Not only did the bit sets not match anymore, but we've exploded out of our 4 bit world, into the strange land of 5 bits. On a 32 or 64 bit system, this is not a problem most of the time, but if we were running on a 4 bit computer, then we just went boom!
Overflows were the other main bug in the generic solution I wrote in my last post about enums. By using +, I was running the risk of my flag enum summing code throwing an overflow exception.
But flags Enums always use powers of 2, right? Well, normally, yes. But .NET does not enforce it, and there's no reason why you can't use other values if you need to. In the first Enum sample I gave at the top, I might choose to coalesce SecondValue and ThirdValue into one UberValue, in which case it would have a value of 6 (0110). Summing every member would then count the 2 and 4 bits twice, whereas Or leaves each of them set exactly once.
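A quick sketch of what a combined member does to the two approaches. The enum name EnumSampleWithUber and both Merge helpers are invented for this example; they are not the code from the previous post.

```vbnet
Imports System

Module CompositeFlagSketch
    <Flags()> _
    Public Enum EnumSampleWithUber
        None = 0
        FirstValue = 1
        SecondValue = 2
        ThirdValue = 4
        UberValue = 6      ' SecondValue and ThirdValue together (0110)
        FourthValue = 8
    End Enum

    ' Hypothetical helper: merges every defined value with Or.
    Public Function MergeWithOr(ByVal poType As Type) As Long
        Dim lRet As Long = 0
        For Each oVal As Object In [Enum].GetValues(poType)
            lRet = lRet Or Convert.ToInt64(oVal)
        Next
        Return lRet
    End Function

    ' Hypothetical helper: the buggy variant, merging with + instead.
    Public Function MergeWithPlus(ByVal poType As Type) As Long
        Dim lRet As Long = 0
        For Each oVal As Object In [Enum].GetValues(poType)
            lRet = lRet + Convert.ToInt64(oVal)
        Next
        Return lRet
    End Function

    Sub Main()
        Console.WriteLine(MergeWithOr(GetType(EnumSampleWithUber)))    ' 15 (01111)
        Console.WriteLine(MergeWithPlus(GetType(EnumSampleWithUber)))  ' 21 (10101)
    End Sub
End Module
```

With the summed mask of 21, a perfectly good value such as SecondValue (2) no longer passes the (value And mask) = value test, because the 2 bit has been carried away.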
So what else?
Well, in the generic sample I gave in the last post, I was only accepting Int32 values for testing. This is a problem since Enums can actually be defined as Byte, Short, Integer and Long. My sample never allowed Enum values that were larger than Int32.MaxValue to be passed in for testing, so I was locking out all those really large Enums. So I modified my new version to accept Longs as input.
I also discovered (the hard way :) that when you call System.Enum.IsDefined(System.Type, SomeNumber), it throws an exception if the underlying type of the Enum (Byte, Short, Integer or Long) does not exactly match the type of the number passed in - which in this case was now always Long.
So I had to use a bit of reflection myself to first vet the underlying type of the Enum, and then force the value to match.
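The mismatch and the workaround can be sketched like this. The ShortSample enum and the IsDefinedAs helper are made up for the example, and I'm assuming the exception raised is an ArgumentException, which is what the Enum.IsDefined documentation describes for a value that is not of the enum's underlying type.

```vbnet
Imports System

Module IsDefinedSketch
    Public Enum ShortSample As Short
        FirstValue = 1
        SecondValue = 2
    End Enum

    ' Hypothetical helper: converts the Long to the enum's underlying type
    ' before asking IsDefined. Assumes plValue fits in that type;
    ' Convert.ChangeType throws an OverflowException if it does not.
    Public Function IsDefinedAs(ByVal poType As Type, ByVal plValue As Long) As Boolean
        Dim oUnderlying As Type = [Enum].GetUnderlyingType(poType)
        Dim oValue As Object = Convert.ChangeType(plValue, oUnderlying)
        Return [Enum].IsDefined(poType, oValue)
    End Function

    Sub Main()
        Dim lValue As Long = 1
        Try
            ' A boxed Long against a Short-based enum: expected to throw
            Console.WriteLine([Enum].IsDefined(GetType(ShortSample), lValue))
        Catch ex As ArgumentException
            Console.WriteLine("Type mismatch: " & ex.GetType().Name)
        End Try

        Console.WriteLine(IsDefinedAs(GetType(ShortSample), 1))  ' True
        Console.WriteLine(IsDefinedAs(GetType(ShortSample), 3))  ' False
    End Sub
End Module
```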
As a result, my sample is a bit more Reflection-heavy than I'd planned - although I've decided that this isn't really an issue. It's very rare that this code will need to be called, so the performance hit is a small price to pay to easily work out if you have a match or not. I particularly don't like all the boxing that is going on, but such is life, I'm too busy to find a better way to improve it right now. (Post note: I just looked. There are actually only two box instructions.)
Here's my (current) end result. Given how I love to tweak and twist, I imagine it'll change again soon as I find performance improvements that I don't have time to actually waste on it :)
Public Shared Function IsValid(ByVal poType As System.Type, ByVal plValue As Int64) As Boolean
    'we need real data, fool!
    If poType Is Nothing OrElse Not poType.IsEnum Then
        Return False
    End If
    'find the real type of the enum - could be Byte, Short, Integer or Long
    Dim oEnumType As System.Type = System.Enum.GetUnderlyingType(poType)
    'box the value passed in, ready for conversion
    Dim oValue As Object = plValue
    'if the enum isn't already long
    If Not oEnumType.Equals(GetType(Int64)) Then
        'check for the type, and make sure value is within the boundaries of allowable values for that type.
        If (oEnumType.Equals(GetType(Int32)) AndAlso (plValue > Int32.MaxValue OrElse plValue < Int32.MinValue)) Then
            Return False
        ElseIf (oEnumType.Equals(GetType(Int16)) AndAlso (plValue > Int16.MaxValue OrElse plValue < Int16.MinValue)) Then
            Return False
        ElseIf (oEnumType.Equals(GetType(Byte)) AndAlso (plValue > Byte.MaxValue OrElse plValue < Byte.MinValue)) Then
            Return False
        End If
        'down convert the value to the right type
        oValue = System.Convert.ChangeType(plValue, oEnumType)
    End If
    'an exact match against a defined member is the easy case
    Dim bRet As Boolean = System.Enum.IsDefined(poType, oValue)
    If bRet = False Then
        'for flags enums, the value is also valid if all its bits are covered by the defined members
        If poType.GetCustomAttributes(GetType(System.FlagsAttribute), False).Length > 0 Then
            Dim lRet As Int64 = MergeValues(poType)
            If (plValue And lRet) = plValue Then
                bRet = True
            End If
        End If
    End If
    Return bRet
End Function
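Public Shared Function MergeValues(ByVal poType As System.Type) As Int64
    'we can only merge the values of an enum
    If poType Is Nothing OrElse Not poType.IsEnum Then
        Return 0
    End If
    Dim lRet As Int64 = 0
    'GetValues returns boxed enum values, so convert each one explicitly (this also compiles under Option Strict On)
    For Each oVal As Object In System.Enum.GetValues(poType)
        'Or, not +, so bits shared by several members are only set once
        lRet = lRet Or System.Convert.ToInt64(oVal)
    Next
    Return lRet
End Function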
Have I missed anything else in here? If so, please let me know :)