Friday, 31 January 2014

Deserializing/Serializing XML that contains xsi:type attributes (and other XML adventures)

I wanted to take an arbitrary XML format and turn it into C# classes. I even considered for a while to write my own IXmlSerializable implementation of the classes, but quickly gave up because of their large number and heavy imbrication. Before we proceed you should know that there are several ways in which to turn XML into C# classes. Here is a short list (google it to learn more):
  • In Visual Studio 2012, all you have to do is copy the XML, then go to Edit -> Paste Special -> Paste XML as classes. There is an option for pasting JSON there as well.
  • There is the xsd.exe option. This is usually shipped with the Windows SDK and you have to either add the folder to the PATH environment variable so that the utility works everywhere, or use the complete path (which depends on which version of SDK you have).
  • xsd2Code is an addon for Visual Studio which gives you an extra menu option when you right click an .xsd file in the Solution Explorer to transform it to classes
  • Other zillion custom made tools that transform the XML into whatever

Anyway, the way to turn this XML into classes manually (since I didn't like the output of any of the tools above and some were even crashing) is this:
  • Create a class that is decorated with the XmlRoot attribute. If the root element has a namespace, don't forget to specify the namespace as well. Example:
    [XmlRoot(ElementName = "RootElement", Namespace = "http://www.somesite.org/2005/someSchema", IsNullable = false)]
  • For each descendant element you create a class. You add a get/set property to the parent element class, then you decorate it with the XmlElement (or XmlAttribute, or XmlText, etc). Specify the ElementName as the exact name of the element in the source XML and the Namespace url if it is different from the namespace of the document root. Example:
    [XmlElement(ElementName = "Integer", Namespace = "http://www.somesite.org/2005/differentSchemaThanRoot")]
  • If there are supposed to be more children elements of the same type, just set the type of the property to an array or a List of the class type representing one element
  • Create an instance of an XmlSerializer using the type of the root element class as a parameter. Example:
    var serializer = new XmlSerializer(typeof(RootElementEntity));
  • Create an XmlSerializerNamespaces instance and add all the namespaces in the document to it. Example:
    var ns = new XmlSerializerNamespaces(); ns.Add("ss", "http://www.somesite.org/2005/someSchema"); ns.Add("ds", "http://www.somesite.org/2005/differentSchemaThanRoot");
  • Use the namespaces instance to serialize the class. Example: serializer.Serialize(stream, instance, ns);

The above technique serializes a RootElementEntity instance to something similar to:
<ss:RootElement xmlns:ss="http://www.somesite.org/2005/someSchema" xmlns:ds="http://www.somesite.org/2005/differentSchemaThanRoot">
<ds:Integer>10</ds:Integer>
</ss:RootElement>

Now, everything is almost good. The only problem I met doing this was trying to deserialize an XML containing xsi:type attributes. An exception of type InvalidOperationException was thrown with the message "The specified type was not recognized: name='TheType', namespace='http://www.somesite.org/2005/someschema', at " and then the XML element that caused the exception. (Note that this is an internal exception of the first InvalidOperationException thrown that just says there was an error in the XML)

I finally found the solution, even if it is not the most intuitive. You need to create a type that inherits from the type you want associated to the element. Then you need to decorate it (and the original element) with an XmlRoot attribute specifying the namespace (even if the namespace is the same as the one of the document root element). And then you need to decorate the base type with the XmlInclude attribute. Here is an example.

The XML:

<ss:RootElement xmlns:ss="http://www.somesite.org/2005/someSchema" xmlns:ds="http://www.somesite.org/2005/differentSchemaThanRoot">
<ds:Integer>10</ds:Integer>
<ss:MyType xsi:type="ss:TheType">10</ss:MyType>
</ss:RootElement>

You need to create the class for MyType then inherit TheType from it:
[XmlRoot(Namespace="http://www.somesite.org/2005/someSchema")]
[XmlInclude(typeof(TheType))]
public class MyTypeEntity {}

[XmlRoot(Namespace="http://www.somesite.org/2005/someSchema")]
public class TheType: MyTypeEntity {}
Removing any of these attributes makes the deserialization fail.

Hope this helps somebody.

5 comments:

  1. Appreciate your post. This was causing me some issues, and there is not a lot of helpful information available.

    ReplyDelete
  2. Just wanted to say thank you! This helped me a great deal. Another way is to pass the types as an extra parameter when serializing, but if you cannot do this, your technique is great!

    ReplyDelete
  3. Thank you my friend!!!

    ReplyDelete
  4. Thank You!!
    I was totally stuck on the "specified type was not recognized" exception. I already had the XmlInclude attribute on my base class, but I never would've thought to use the XmlRoot attributes.
    You saved me a ton of time.

    ReplyDelete
  5. You're welcome. It kind of wasted a lot of time for me as well.

    ReplyDelete