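This section describes functions which import XML data. Two separate sets of functions implement two approaches to parsing XML data: a DOM (Document Object Model) interface, which loads the whole document into memory as a tree of nodes, and a SAX (Simple API for XML) interface, which reads the document sequentially and reports its contents as a stream of events.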
With both approaches, creation and modification of the document are not possible.
Two opaque types are implemented: DOM nodes (including document, element and text nodes), and attribute lists. A document node object is created with the functions xmlreadstring (XML string) or xmlread (XML file or other input channel). Other DOM nodes and attribute lists are obtained by using DOM methods and properties.
Method | Description |
---|---|
fieldnames | List of property names |
getElementById | Get a node specified by id |
getElementsByTagName | Get a list of all descendant nodes of the given tag name |
subsref | Get a property value |
xmlrelease | Release a document node |
Property | Description |
---|---|
attributes | Attribute list (opaque object) |
childElementCount | Number of element children |
childNodes | List of child nodes |
children | List of element child nodes |
depth | Node depth in document tree |
documentElement | Root element of a document node |
firstChild | First child node |
firstElementChild | First element child node |
lastChild | Last child node |
lastElementChild | Last element child node |
line | Line number in original XML document |
nextElementSibling | Next sibling element node |
nextSibling | Next sibling node |
nodeName | Node tag name, '#document', or '#text' |
nodeValue | Text of a text node |
offset | Offset in original XML document |
ownerDocument | Owner DOM document node |
parentNode | Parent node |
previousElementSibling | Previous sibling element node |
previousSibling | Previous sibling node |
textContent | Concatenated text of all descendant text nodes |
xml | XML representation, including all children |
A document node object is released with the xmlrelease method. Once a document node object is released, all associated node objects become invalid. Attribute lists and native LME types (strings and numbers) remain valid.
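As an illustration of the properties and release rules above, the following minimal sketch walks the direct children of a root element and prints each node name. It relies only on xmlreadstring, documentElement, childNodes, nodeName and xmlrelease as described in this section; the sample XML string is hypothetical, and length and fprintf are assumed to behave on lists and strings as they do elsewhere in LME.

```lme
// Hypothetical sample document, for illustration only
doc = xmlreadstring('<a>one <b>two</b> <c>three</c></a>');
root = doc.documentElement;
nodes = root.childNodes;        // list of child nodes (text and element nodes)
for i = 1:length(nodes)
  fprintf('%d: %s\n', i, nodes{i}.nodeName);
end
name = root.nodeName;           // native LME string
xmlrelease(doc);                // root and nodes must not be used from here on
fprintf('root was %s\n', name); // name is a plain string, so it remains valid
```

Copying the values that are still needed into native strings or numbers before calling xmlrelease avoids touching invalidated node objects afterwards.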
Method | Description |
---|---|
fieldnames | List of attribute names |
length | Number of attributes |
subsref | Get an attribute |
Properties of attribute lists are the attribute values as strings. A property whose name is compatible with LME field name syntax can be retrieved with the dot syntax, such as attr.id. For names containing invalid characters, such as accented letters, or to enumerate unknown attributes, attributes can be accessed by index, with either parentheses or braces. The result of indexing is a structure with two fields, name and value.
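The two access styles for attribute lists might be used as in the following sketch. The sample element is hypothetical. The sketch assumes that indices start at 1, as with other LME indexing, and makes no assumption about the order in which attributes are enumerated.

```lme
// Hypothetical sample element with two attributes
doc = xmlreadstring('<c id="y" num="3">three</c>');
attr = doc.documentElement.attributes;

// Dot syntax: the attribute name is a valid LME field name
id = attr.id;

// Indexing: enumerate attributes whose names are not known in advance
// (assumption: indices run from 1 to length(attr))
for i = 1:length(attr)
  a = attr(i);                  // structure with fields name and value
  fprintf('%s = %s\n', a.name, a.value);
end

xmlrelease(doc);
```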
XML is read from a file descriptor, typically obtained with fopen. The next event is retrieved with saxnext which returns its description in a structure.
Get a node specified by id.
node = getElementById(root, id)
getElementById(root,id) gets the node which is a descendant of node root and whose attribute id matches argument id. It throws an error if the node is not found.
In valid XML documents, every id must be unique. If the document is invalid, the first element with the specified id is obtained.
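Because getElementById throws an error when no node matches, a lookup which may fail can be guarded. The sketch below assumes the standard LME try/catch construct; the document string and the id 'missing' are hypothetical.

```lme
// Hypothetical document which contains no element with id "missing"
doc = xmlreadstring('<a><b id="x">two</b></a>');
try
  node = getElementById(doc, 'missing');
  fprintf('%s\n', node.xml);
catch
  // getElementById threw an error because the id was not found
  fprintf('id not found\n');
end
xmlrelease(doc);
```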
Get a list of all descendant nodes of the given tag name.
node = getElementsByTagName(root, name)
getElementsByTagName(root,name) collects a list of all the element nodes which are direct or indirect descendants of node root and whose name matches argument name.
    doc = xmlreadstring('<p>Abc <b>de</b> <i>fg <b>hijk</b></i></p>');
    b = getElementsByTagName(doc, 'b')
      b =
        {DOMNode,DOMNode}
    b2 = b{2}.xml
      b2 =
        <b>hijk</b>
    xmlrelease(doc);
Get current line number of SAX parser.
n = saxcurrentline(sax)
saxcurrentline(sax) gets the current line of the XML file parsed by the SAX parser passed as argument. It can also be used after an error.
See also: saxcurrentpos, saxnew, saxnext
Get current position in input stream of SAX parser.
n = saxcurrentpos(sax)
saxcurrentpos(sax) gets the current position in the XML file parsed by the SAX parser passed as argument, i.e. the number of bytes consumed thus far. It can also be used after an error.
The value given by saxcurrentpos differs from the result of ftell on the file descriptor, because the SAX parser input is buffered.
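Since saxcurrentline and saxcurrentpos can be used after an error, they can help report where parsing stopped. The following sketch assumes that saxnext throws an LME error on malformed input and that try/catch is available; the file name 'data.xml' is hypothetical.

```lme
fd = fopen('data.xml');         // hypothetical input file
sax = saxnew(fd);
try
  while true
    ev = saxnext(sax);
    if strcmp(ev.event, 'docEnd')
      break;
    end
  end
catch
  // Assumption: a parsing error ends up here; report where it occurred
  fprintf('XML error near line %d (byte %d)\n', ...
    saxcurrentline(sax), saxcurrentpos(sax));
end
saxrelease(sax);
fclose(fd);
```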
See also: saxcurrentline, saxnew, saxnext
Create a new SAX parser.
sax = saxnew(fd)
sax = saxnew(fd, Trim=t, HTML=h)
saxnew(fd) creates a new SAX parser to parse XML from file descriptor fd. The parser is an opaque (non-numeric) type. Once it is no longer needed, it should be released with the saxrelease function.
Named argument Trim (a boolean value) specifies whether whitespace is trimmed around tags. The default value is false.
Named argument HTML (a boolean value) specifies HTML mode. The default value is false (XML mode). HTML mode has the following differences with respect to XML mode:
This can be used for the lowest level of a rudimentary HTML parser.
    fd = fopen('data.xml');
    sax = saxnew(fd);
    while true
      ev = saxnext(sax);
      switch ev.event
        case 'docBegin'
          // beginning of document
        case 'docEnd'
          // end of document
          break;
        case 'elBegin'
          // beginning of element ev.tag with attr ev.attr
        case 'elEnd'
          // end of element ev.tag
        case 'elEmpty'
          // empty element ev.tag with attr ev.attr
        case 'text'
          // text element ev.text
      end
    end
    saxrelease(sax);
    fclose(fd);
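As a concrete variation on the event loop, the sketch below counts the elements of a document and uses the Trim named argument. The file name is hypothetical, and the event names are those used in the example for saxnew; counting both 'elBegin' and 'elEmpty' is one possible convention, not the only one.

```lme
fd = fopen('data.xml');         // hypothetical input file
sax = saxnew(fd, Trim=true);    // trim whitespace around tags
n = 0;
while true
  ev = saxnext(sax);
  if strcmp(ev.event, 'elBegin') || strcmp(ev.event, 'elEmpty')
    n = n + 1;                  // one more element (with or without content)
  elseif strcmp(ev.event, 'docEnd')
    break;
  end
end
saxrelease(sax);
fclose(fd);
fprintf('%d elements\n', n);
```

Unlike the DOM approach, this loop never holds the whole document in memory, which is the usual reason to prefer SAX for large inputs.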
Get next SAX event.
event = saxnext(sax)
saxnext(sax) gets the next SAX event and returns its description in a structure. Argument sax is the SAX parser created with saxnew.
The event structure contains the following fields:
Release a SAX parser.
saxrelease(sax)
saxrelease(sax) releases the SAX parser sax created with saxnew.
Load a DOM document object from a file descriptor.
doc = xmlread(fd)
xmlread(fd) reads XML from a file descriptor until the end and returns it as a new DOM document node object. The file descriptor can be closed before the document node object is used. Once the document is no longer needed, it should be released with the xmlrelease method.
Load an XML file 'doc.xml' (this assumes support for files with the function fopen).
    fd = fopen('doc.xml');
    doc = xmlread(fd);
    fclose(fd);
    root = doc.documentElement;
    ...
    xmlrelease(doc);
See also: xmlreadstring, xmlrelease, saxnew
Parse an XML string into a DOM document object.
doc = xmlreadstring(str)
xmlreadstring(str) parses XML from a string to a new DOM document node object. Once the document is not needed anymore, it should be released with the xmlrelease method.
    xml = '<a>one <b id="x">two</b> <c id="y" num="3">three</c></a>';
    doc = xmlreadstring(xml)
      doc =
        DOM document
    root = doc.documentElement;
    root.nodeName
      ans =
        a
    root.childNodes{1}.nodeValue
      ans =
        one
    root.childNodes{2}.xml
      ans =
        <b id="x">two</b>
    a = root.childNodes{2}.attributes
      a =
        DOM attributes (1 item)
    a.id
      ans =
        x
    getElementById(doc,'y').xml
      ans =
        <c id="y" num="3">three</c>
    xmlrelease(doc);
Release a DOM document object.
xmlrelease(doc)
xmlrelease(doc) releases a DOM document object. All DOM node objects obtained directly or indirectly from it become invalid.
Releasing a node which is not a document has no effect.