Methods for reading XML in Java

DOM (Document Object Model) is an object-based approach: the parser reads the XML document and rebuilds it in memory as a tree of objects (nodes), one for each tag, attribute, or piece of text. Each node can have any number of child nodes, and each child can in turn have several levels of descendants or none at all, so the result is a tree.

Pluses:

  • Easy to program.
  • If the XML contains many objects that refer to each other, two passes over the document are enough: the first pass creates the objects without references and fills a name-to-object dictionary, and the second pass restores the links.
  • If the XML contains an error, the half-built structure left in memory is discarded automatically.
  • Suitable for both reading and writing.

Minuses:

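  • Low speed: the whole document must be parsed before any of it can be used.
  • High memory use: the entire document tree is held in memory.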
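
The two-pass approach to cross-references mentioned above can be sketched with the JDK's built-in DOM parser. This is only an illustration: the Person class, the person element and its id, name and friend attributes are invented for the example and are not part of any standard format.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomCrossRefExample {
    // Hypothetical domain object: a person who refers to another person.
    static final class Person {
        final String name;
        Person friend; // filled in during the second pass
        Person(String name) { this.name = name; }
    }

    static Map<String, Person> load(String xml) throws Exception {
        // DOM builds the whole tree in memory before we touch it.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("person");

        // Pass 1: create objects without references, fill the id-to-object dictionary.
        Map<String, Person> byId = new LinkedHashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            byId.put(e.getAttribute("id"), new Person(e.getAttribute("name")));
        }
        // Pass 2: walk the same tree again and restore the links.
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            byId.get(e.getAttribute("id")).friend = byId.get(e.getAttribute("friend"));
        }
        return byId;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<people>"
                + "<person id=\"a\" name=\"Ann\" friend=\"b\"/>"
                + "<person id=\"b\" name=\"Bob\" friend=\"a\"/>"
                + "</people>";
        Map<String, Person> people = load(xml);
        System.out.println(people.get("a").friend.name); // prints "Bob"
    }
}
```

The second pass is cheap here precisely because the tree is already in memory, which is the trade-off DOM makes against speed and memory use.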

SAX (Simple API for XML) is an event-driven approach: the parser reads the XML document sequentially and reports events (an opening tag together with its attributes, a closing tag, character data) by calling event handlers supplied by the application. Unlike DOM, it does not keep the document in memory.

Pluses:

  • High speed.
  • Low memory usage.

Minuses:

  • If the XML contains many objects that refer to each other, the application has to keep the string references in temporary storage and convert them into object references once the whole document has been read.
  • If the XML contains an error, the half-built structure of domain objects remains in memory, and the application must clean it up itself.
  • Read-only.
  • Quite difficult to program: the application has to track the parsing state itself between handler calls.
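
A minimal sketch of the handler-based style, using the JDK's SAX parser. The title element and the class names are assumptions made for the example. Note how the handler has to remember, in a field, whether it is currently inside the element it cares about.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxReadExample {
    // The parser calls these methods; the handler keeps its own state between calls.
    static final class TitleHandler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        private StringBuilder current; // non-null only while inside a <title> element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("title".equals(qName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text may arrive in several chunks, so append rather than assign.
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(qName)) {
                titles.add(current.toString());
                current = null;
            }
        }
    }

    static List<String> titles(String xml) throws Exception {
        TitleHandler handler = new TitleHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library><book><title>Dune</title></book>"
                + "<book><title>Emma</title></book></library>";
        System.out.println(titles(xml)); // prints "[Dune, Emma]"
    }
}
```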

StAX (Streaming API for XML) is a streaming approach that consists of two APIs offering different levels of abstraction. The cursor API lets an application treat XML as a stream of tokens (or events): the application can check the parser's state, get information about the last token parsed, and then move on to the next one. The second, higher-level API is based on event iterators and lets an application process XML as a series of event objects, each of which describes one piece of the XML document. All the application has to do is determine the type of each event, cast it to the corresponding concrete type, and call that type's methods to retrieve the information related to the event.

Pluses:

  • It is not based on handler callbacks: the application pulls events from the parser, so it does not have to maintain emulated parser state as it does with SAX.
  • Retains the advantages that SAX has over DOM.

Minuses:

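  • Forward-only: as with SAX, the parser cannot go back to parts of the document it has already read. (StAX itself is not limited to reading; it also defines the writer interfaces XMLStreamWriter and XMLEventWriter.)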
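
The two StAX APIs described above might be used roughly as follows. This is a sketch against the javax.xml.stream package shipped with the JDK; the title element and the method names are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

class StaxReadExample {
    // Cursor API: the application asks the reader for the next token.
    static List<String> titlesWithCursor(String xml) throws Exception {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<String> result = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                int token = reader.next();
                if (token == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // Reads the text and leaves the cursor on the matching end tag.
                    result.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return result;
    }

    // Event iterator API: each step returns an event object that can be
    // inspected and converted to its concrete type.
    static List<String> titlesWithEvents(String xml) throws Exception {
        XMLEventReader events =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        List<String> result = new ArrayList<>();
        try {
            while (events.hasNext()) {
                XMLEvent event = events.nextEvent();
                if (event.isStartElement()
                        && "title".equals(event.asStartElement().getName().getLocalPart())) {
                    result.add(events.getElementText());
                }
            }
        } finally {
            events.close();
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library><book><title>Dune</title></book>"
                + "<book><title>Emma</title></book></library>";
        System.out.println(titlesWithCursor(xml)); // prints "[Dune, Emma]"
        System.out.println(titlesWithEvents(xml)); // prints "[Dune, Emma]"
    }
}
```

In both variants the loop belongs to the application, which is why no state has to be carried between callbacks as in the SAX style.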

When should you use DOM, and when SAX or StAX?

DOM is a natural choice when the XML document itself is the domain object: when you need to inspect and change the structure of the document, or when the information in the document is used repeatedly.

For fast, one-time reading, SAX or StAX is the better choice.

How to write XML

Direct writing: the XML is written out tag by tag, attribute by attribute.

Pluses:

  • High speed.
  • Low memory use: no intermediate objects are created.

Minuses:

  • Write-only.
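
One way to write XML directly in Java is javax.xml.stream.XMLStreamWriter from the JDK. Choosing this API is an assumption made for the sketch, as are the element and attribute names.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

class DirectWriteExample {
    static String write() throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        // Every call emits output immediately; no document tree is built.
        writer.writeStartDocument();
        writer.writeStartElement("library");
        writer.writeStartElement("book");
        writer.writeAttribute("id", "b1"); // attributes must follow the start tag directly
        writer.writeCharacters("Dune");
        writer.writeEndElement(); // closes <book>
        writer.writeEndElement(); // closes <library>
        writer.writeEndDocument();
        writer.close();

        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write());
    }
}
```

Because nothing is kept in memory, the writer cannot go back and change what has already been emitted, which is what "write-only" means in practice.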

Writing through DOM (Document Object Model): the complete XML structure is built in memory first and only then written out.

Pluses:

  • Suitable for both writing and reading.

Minuses:

  • Low speed.
  • High memory use: the whole document is built in memory before it is written.
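
A sketch of the DOM route: the tree is built first and then serialized. Here the serialization step uses an identity Transformer from javax.xml.transform, which is one common option rather than the only one; the element names are invented for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomWriteExample {
    static String write() throws Exception {
        // Step 1: build the complete structure in memory.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element library = doc.createElement("library");
        doc.appendChild(library);

        Element book = doc.createElement("book");
        book.setAttribute("id", "b1");
        book.setTextContent("Dune");
        library.appendChild(book);

        // Step 2: only now serialize the finished tree.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write());
    }
}
```

Until step 2 runs, the tree can still be read and modified freely, which is the advantage this approach buys with its extra memory.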

JAXP

JAXP (Java API for XML Processing) is a set of APIs that simplifies the processing of XML data in Java programs. It gives access to DOM, SAX and StAX parsers, supports XSLT, and can work with DTDs.

XSLT

XSLT (eXtensible Stylesheet Language Transformations) is a language for transforming XML documents.

XSLT was created for use in XSL (eXtensible Stylesheet Language), a stylesheet language for XML. During an XSL transformation, the XSLT processor reads the XML document and one or more XSLT stylesheets. Following the instructions it finds in the stylesheets, it generates a new XML document or a fragment of one.
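
A transformation of this kind can be run from Java through the javax.xml.transform API. The sketch below is only an illustration: the stylesheet, the input document and the element names (library, book, title, titles, t) are all made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltExample {
    // Hypothetical stylesheet: turns a list of books into a flat list of titles.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
          + "<xsl:template match=\"/library\">"
          + "<titles><xsl:for-each select=\"book\">"
          + "<t><xsl:value-of select=\"title\"/></t>"
          + "</xsl:for-each></titles>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    static String transform(String xml) throws Exception {
        // The processor reads the stylesheet first, then applies it to the input document.
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library><book><title>Dune</title></book>"
                + "<book><title>Emma</title></book></library>";
        System.out.println(transform(xml));
    }
}
```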

