Working with CDATA

Let's dig into the processing and validation of CDATA sections in your XML, often used to embed blocks of XML as strings inside an existing XML structure. Specifically we are going to look at:

  • Property Transfers -> How to transfer values into or out of an embedded XML block
  • Assertions -> How to use the standard XPath assertion to assert embedded XML content
  • Validations -> How to create scripts to validate the XML of these strings given that you have the schema available

Finally, we'll look at how a soapUI Pro Event Handler can make all this much easier.

1. CDATA Background

CDATA sections are used in XML documents to escape longer blocks of text that could otherwise be interpreted as markup, for example

<message><![CDATA[<data>some embedded xml</data>]]></message>

Here the string "<data>some embedded xml</data>" is just that; a string, and not XML. Another way of writing this could be

<message>&lt;data&gt;some embedded xml&lt;/data&gt;</message>

This is 100% equivalent to the previous version using CDATA; parsing either of them with an XML parser returns the content as a string, not as parsed XML.
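A quick way to convince yourself of this equivalence is to parse both forms and compare the resulting text. The following is a minimal standalone Groovy sketch using the JDK's DOM parser; the helper name textOf is made up for illustration and is not part of soapUI.

```groovy
import javax.xml.parsers.DocumentBuilderFactory

// Illustrative helper: parse an XML string and return the root element's text
def textOf = { String xml ->
    def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    def doc = builder.parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))
    doc.documentElement.textContent
}

def withCdata = '<message><![CDATA[<data>some embedded xml</data>]]></message>'
def withEntities = '<message>&lt;data&gt;some embedded xml&lt;/data&gt;</message>'

// Both forms yield the same plain string; neither yields a <data> element
assert textOf(withCdata) == '<data>some embedded xml</data>'
assert textOf(withCdata) == textOf(withEntities)
```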

What if the embedded XML contains a CDATA section? Wouldn't the embedded ]]> terminate the outer <![CDATA[ ? Yes, it would! So you can't embed a CDATA section straight off; you need to temporarily terminate the outer CDATA section to pull this off. Let's say we have the following string:

<data>some embedded xml <![CDATA[<text>with xml</text>]]></data>

and want to put this in an XML document. The result could be either

<message>&lt;data&gt;some embedded xml &lt;![CDATA[&lt;text&gt;with xml&lt;/text&gt;]]&gt;&lt;/data&gt;</message>

with standard XML entities, or (pay attention now..)

<message><![CDATA[<data>some embedded xml <![CDATA[<text>with xml</text>]]]]>><![CDATA[</data>]]></message>

Confused? The first CDATA section wraps the following characters: "<data>some embedded xml <![CDATA[<text>with xml</text>]]" (notice the missing terminating '>', which would have turned the last three characters into a CDATA terminator). Then comes a single ">" (which doesn't need to be escaped as &gt; since it can't be mistaken for markup here), and then another CDATA section containing the string "</data>". Concatenating these three strings gives us the original, and an XML processor will return that same original string with either method.
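The splitting trick can be automated. Below is a small standalone Groovy sketch, assuming a hypothetical helper named wrapInCdata (not a soapUI function), that applies the same pattern to every "]]>" in a string and then parses the result back to check that nothing was lost.

```groovy
import javax.xml.parsers.DocumentBuilderFactory

// Hypothetical helper: wrap a string in CDATA, splitting at each embedded "]]>".
// "]]" stays in the current section, "]]>" closes it, a bare ">" follows,
// and "<![CDATA[" opens the next section.
def wrapInCdata = { String text ->
    '<![CDATA[' + text.replace(']]>', ']]]]>><![CDATA[') + ']]>'
}

def embedded = '<data>some embedded xml <![CDATA[<text>with xml</text>]]></data>'
def message = '<message>' + wrapInCdata(embedded) + '</message>'

// Parse the result and check that the original string comes back unchanged
def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
def doc = builder.parse(new ByteArrayInputStream(message.getBytes('UTF-8')))
assert doc.documentElement.textContent == embedded
```

For the sample string this produces exactly the second message form shown above.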

2. CDATA in soapUI

It is (unfortunately) quite common that SOAP messages contain some part of the payload in a request or response as a string and not as XML, which has both advantages and disadvantages. In soapUI these XML strings are not easily validated against a schema (scripting required!), they are not easily asserted with XPath, and using them as targets/sources for property transfers is difficult since they are strings, not XML. Also the extended message viewers in soapUI Pro (Outline, Overview) show these as strings and not as markup, which can be confusing.

Let's say we have the following response message for an item search:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:sam="http://www.example.org/sample/">
   <soapenv:Header/>
   <soapenv:Body>
      <sam:searchResponse>
         <sam:searchResponse>
            <item><id>1234</id><description><![CDATA[<item><width>123</width><height>345</height>
<length>098</length><isle>A34</isle></item>]]></description><price>123</price>
            </item>
         </sam:searchResponse>
      </sam:searchResponse>
   </soapenv:Body>
</soapenv:Envelope>

Here you see the description of the item being embedded as an XML String. In the soapUI Pro Outline and Overview editors this shows up as:

[Screenshot: overviewsample]

and

[Screenshot: outlinesample]

Not very user-friendly!

Fortunately there are some workarounds available.

3. Property Transfers and CDATA

As you know, Property-Transfers are TestSteps for transferring property values between requests, responses, properties, etc. (read more in the User Guide). A common scenario is transferring a value from a response message to the following request (for example, a session id). In the standard case this is straightforward: set the source/target of the property-transfer to the desired message property and specify an XPath statement that selects the desired source/target element (in soapUI Pro all this is done with point-and-click wizards). In our scenario, however, the XPath of the property-transfer can only point at the element containing the XML message string, not "inside" it (since it is just a string). So what to do? The solution is to use temporary properties:

For Property-Transfer sources being "inside" a CDATA XML block:

  1. Add a temporary property to your TestCase
  2. Add one property-transfer that transfers the XML message string to this temporary property
  3. Add another property-transfer that has this temporary property as its source; since it is now a standalone XML string it can be parsed as such, and an XPath for this source will work just fine.

For Property-Transfer targets being "inside" a CDATA block, a similar approach works out:

  1. Add a temporary property to your TestCase
  2. Add one property-transfer that transfers the target XML string to this temporary property
  3. Add another Property Transfer that transfers the desired value "into" the xml of this property (which is now handled as a standalone XML string)
  4. Finally, add a last Property-Transfer that transfers the modified temporary property back into the original target.
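Outside soapUI, the same two-stage idea (pull the string out, parse it as XML in its own right, then write it back) can be sketched in plain Groovy with the JDK's DOM classes. The sample messages are trimmed-down assumptions, the variables temp1 and temp2Doc stand in for the temporary properties, and the helpers parse and serialize are illustrative names only.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.transform.OutputKeys
import javax.xml.transform.TransformerFactory
import javax.xml.transform.dom.DOMSource
import javax.xml.transform.stream.StreamResult

// Illustrative helpers (not soapUI API): string -> DOM and DOM -> string
def parse = { String xml ->
    def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    builder.parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))
}
def serialize = { node ->
    def transformer = TransformerFactory.newInstance().newTransformer()
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, 'yes')
    def writer = new StringWriter()
    transformer.transform(new DOMSource(node), new StreamResult(writer))
    writer.toString()
}

def response = '<item><id>1234</id><description><![CDATA[<item><isle>A34</isle></item>]]></description></item>'
def request = '<search><searchstring><![CDATA[<isle>?</isle>]]></searchstring></search>'

// Source side: the element text plays the role of the temporary property
def temp1 = parse(response).getElementsByTagName('description').item(0).textContent
def isle = parse(temp1).getElementsByTagName('isle').item(0).textContent

// Target side: extract the embedded string, modify it as XML, write it back
def requestDoc = parse(request)
def target = requestDoc.getElementsByTagName('searchstring').item(0)
def temp2Doc = parse(target.textContent)
temp2Doc.documentElement.textContent = isle
target.textContent = serialize(temp2Doc)

assert isle == 'A34'
assert parse(target.textContent).documentElement.textContent == 'A34'
```

Note that writing the string back this way stores it as escaped text rather than as a CDATA section, which is equivalent as far as an XML parser is concerned.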

Let's combine both of these into an example. Say we want to transfer the embedded "isle" value in the example message above into the following search query, which also contains embedded XML:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:sam="http://www.example.org/sample/">
   <soapenv:Header/>
   <soapenv:Body>
     <sam:search>
        <sessionid>123</sessionid>
        <searchstring><![CDATA[<isle>?</isle>]]></searchstring>
     </sam:search>
   </soapenv:Body>
</soapenv:Envelope>

Our TestCase contains the two requests. Start by adding two temporary properties to the TestCase (one for each intermediate XML):

[Screenshot: testcase1]

Now insert a Property-Transfer step between the requests and configure it as follows:

1) Create the first transfer, which transfers the CDATA section in the response (in the description element) to the Temp1 property:

[Screenshot: transfer1]

2) Create a second transfer, which transfers the CDATA section of the request (in the searchstring element) to the Temp2 property:

[Screenshot: transfer2]

3) Now we have both CDATA sections as strings; create a transfer that transfers the isle value from the Temp1 property to the Temp2 property:

[Screenshot: transfer3]

4) So now we have the desired value of the searchstring in the Temp2 property; transfer that back to the request with our last transfer:

[Screenshot: transfer4]

Mission accomplished! Running these four transfers will effectively extract the desired value from the embedded XML and write it into the embedded XML in the request (as you can see in the property-transfer log at the bottom of the above screenshots).

Admittedly, this still seems like a lot of work. Couldn't we do it with a script instead? Sure, let's have a look at what that script would look like (in Groovy):

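import com.eviware.soapui.support.XmlHolder

// create holder for the embedded source XML (the CDATA content of description)
def description = context.expand( '${Request 1#Response#//sam:searchResponse[1]' +
   '/sam:searchResponse[1]/item[1]/description[1]}' )
def descHolder = new XmlHolder( description )

// create holder for the target request
def groovyUtils = new com.eviware.soapui.support.GroovyUtils( context )
def holder = groovyUtils.getXmlHolder( "Request 2#Request" )

// create holder for the embedded target XML (the CDATA content of searchstring)
def searchHolder = new XmlHolder( holder["//searchstring"] )

// transfer the value into the embedded XML, write that back and save
searchHolder["//isle"] = descHolder["//isle"]
holder["//searchstring"] = searchHolder.xml
holder.updateProperty()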

This doesn't require any temporary properties, and we could do some assertions along the way. The choice is yours!

4. XPath Assertions and CDATA

OK, how about assertions? The standard XPath processor will just see the XML string as any old string and not parse it as XML, so we can't assert on its content using the standard XPath assertion. What to do? I can't come up with anything better than a script assertion (except the Event Handler further down). Fortunately, soapUI Pro has a wizard for creating these rather easily: right-click the node to assert (the one containing the XML string) in the Outline view and select "Add Assertion -> for Existence with Script":

[Screenshot: scriptassertionwizard]

soapUI will generate the following script for you (if you don't have soapUI Pro, just add a Script Assertion manually and enter the script below):

import com.eviware.soapui.support.XmlHolder

def holder = new XmlHolder( messageExchange.responseContentAsXml )
holder.namespaces["sam"] = "http://www.example.org/sample/"
def node = holder.getDomNode( "//sam:searchResponse[1]/sam:searchResponse[1]" +
   "/item[1]/description[1]" )

assert node != null

Let's modify this a bit and assert that the isle value starts with an A followed by two digits:

import com.eviware.soapui.support.XmlHolder

def holder = new XmlHolder( messageExchange.responseContentAsXml )
holder.namespaces["sam"] = "http://www.example.org/sample/"

def node = holder["//sam:searchResponse[1]/sam:searchResponse[1]/item[1]/description[1]"]
def descHolder = new XmlHolder( node )
def isle = descHolder["//isle"]

assert isle.length() == 3
assert isle.charAt( 0 ) == 'A'
assert Character.isDigit( isle.charAt( 1 ))
assert Character.isDigit( isle.charAt( 2 ))

This still requires a bit of coding, but it at least makes it possible. The choice is yours!
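If you prefer something more compact, the same "A followed by two digits" rule could also be written as a regular expression. This is only a sketch of an alternative, assuming the isle value has already been extracted as a string; the closure name isValidIsle is hypothetical.

```groovy
// Hypothetical check: ==~ requires the whole string to match the pattern
def isValidIsle = { String value -> value ==~ /A\d{2}/ }

assert isValidIsle('A34')
assert !isValidIsle('B34')   // wrong letter
assert !isValidIsle('A345')  // too long
```

In the script assertion above, the four assert lines could then be replaced by a single assert on the extracted value.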

5. Validation of CDATA Content

Finally we'll look at validation; the schema of the message only defines the XML string as a string and not its complex content; a script will be our solution here as well. Use the same wizard/methodology as described above to extract the value in a script-assertion, then add the following which will load an XSD from the file system and validate the XML in the description:

import com.eviware.soapui.support.XmlHolder
import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory

def holder = new XmlHolder( messageExchange.responseContentAsXml )
holder.namespaces["sam"] = "http://www.example.org/sample/"
def node = holder["//sam:searchResponse[1]/sam:searchResponse[1]/item[1]/description[1]"]

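// load the XSD from the file system (".." stands for the path to your schema file)
def factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
def schema = factory.newSchema(new StreamSource(new FileReader("..")))
def validator = schema.newValidator()
// validate() throws an exception if the embedded XML does not conform to the schema
validator.validate(new StreamSource(new StringReader(node)))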

The XSD is:

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema" elementFormDefault="unqualified">
   <element name="item">
      <complexType>
         <sequence>
            <element name="width" type="long"></element>
            <element name="height" type="long"></element>
            <element name="length" type="long"></element>
            <element name="isle" type="string"></element>
         </sequence>
      </complexType>
   </element>
</schema>
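To try the validation outside soapUI, here is a minimal standalone sketch using the same javax.xml.validation classes. It makes a few assumptions: the schema is inlined as a string (declared without a target namespace) instead of being loaded from a file, the sample documents are made up, and the closure name validationError is hypothetical.

```groovy
import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory
import org.xml.sax.SAXException

// Inlined variant of the item schema, so no file access is needed
def xsd = '''<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema" elementFormDefault="unqualified">
   <element name="item">
      <complexType>
         <sequence>
            <element name="width" type="long"/>
            <element name="height" type="long"/>
            <element name="length" type="long"/>
            <element name="isle" type="string"/>
         </sequence>
      </complexType>
   </element>
</schema>'''

def factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
def schema = factory.newSchema(new StreamSource(new StringReader(xsd)))

// Returns null when the XML is valid, otherwise the validator's error message
def validationError = { String xml ->
    try {
        schema.newValidator().validate(new StreamSource(new StringReader(xml)))
        return null
    } catch (SAXException e) {
        return e.message
    }
}

def good = '<item><width>123</width><height>345</height><length>098</length><isle>A34</isle></item>'
def bad = '<item><width>wide</width><height>345</height><length>098</length><isle>A34</isle></item>'

assert validationError(good) == null
assert validationError(bad) != null
```

Returning the message instead of letting the exception propagate makes it easy to log the reason before failing an assertion.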

This can come in handy if you want to validate REST/HTTP requests which don't have a formalized schema. Since Groovy can also validate against DTD and RelaxNG, those could be used in the same way (check out http://groovy.codehaus.org/Processing+XML for examples).

6. An Event Handler to the Rescue

Wouldn't it be nice if we could just remove those CDATA tags before soapUI processes the response so it is seen as standard XML? Sure, it wouldn't be compliant with the original schema, but it would make transfers and assertions so much easier. Well, once again, Event Handlers in soapUI Pro can do this for us: open the Project window, select the "Events" tab and add a RequestFilter.afterRequest handler. Set its content to:

def content = context.httpResponse.responseContent
content = content.replaceAll( "<!\\[CDATA\\[", "" )
content = content.replaceAll( "]]>", "" )

//log.info( content )

context.httpResponse.responseContent = content

This effectively removes any "<![CDATA[" and "]]>" strings from the response XML, so soapUI processes the entire content as XML and we can view and handle responses as standard XML. For example, the Overview view now shows the "nicer" formatting:
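The effect of the handler can be tried on a plain string outside soapUI. The sketch below uses literal replace calls as an equivalent of the handler's two replaceAll calls, on a trimmed-down sample response; the closure name stripCdata is hypothetical.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathFactory

// Literal equivalents of the handler's two replacements
def stripCdata = { String content ->
    content.replace('<![CDATA[', '').replace(']]>', '')
}

def response = '<item><id>1234</id><description><![CDATA[<item><isle>A34</isle></item>]]></description></item>'
def stripped = stripCdata(response)

// After stripping, the embedded XML is ordinary markup that XPath can reach
def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
def doc = builder.parse(new ByteArrayInputStream(stripped.getBytes('UTF-8')))
def xpath = XPathFactory.newInstance().newXPath()

assert xpath.evaluate('/item/description/item/isle', doc) == 'A34'
```

This only works when the embedded string is itself a well-formed fragment; for instance, an embedded string that starts with its own XML declaration would no longer parse once the markers are removed.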

[Screenshot: niceoverview]

and the Property-Transfer and Assertion wizards are available in the Outline view, allowing us to create these as usual:

[Screenshot: niceoutline]

Of course this has some severe limitations: it depends on the response being formatted the way we expect (although an improved handler could deal with this), and schema-compliance assertions will (probably) fail. But it might be what we need to get the job done, and that's all we want at the end of the day, right?