AbstractThis article shows how to parse existing XML files using the DOM API and how to save a DOM tree to disk as an XML file with the proper encoding. |
This use case shows how to go back and forth between XML files and in-memory DOM trees. You will first learn to construct an in-memory DOM tree from an XML file using a DOM parser. You will learn to configure the parser to make it validating or non validating. The use case will then show you how to create an XML file from an in-memory DOM tree. It will show you how to configure the parser to obtain an XML file with the encoding you want.
[Top]
The CAAXMLDOMTranscode Use Case is a use case of the CAAXMLParser.edu framework that illustrates XMLParser framework capabilities.
[Top]
This use case implements a simple XML file transcoder: a program, which converts XML files from one encoding to another. The transcoder parses an existing XML file using the DOM API, then saves it under the specified name using the requested encoding.
[Top]
To launch CAAXMLDOMTranscode, you will need to set up the build time environment, then compile CAAXMLDOMTranscode along with its prerequisites, set up the run time environment, and then execute the use case [1].
The use case should be launched as follows from the command line:
CAAXMLDOMTranscode (-utf8|-utf16) <infilepath> <outfilepath>
where:
(-utf8|-utf16)
is the choice of the encoding to use
for the transcoded XML file.<infilepath>
is the path of the XML file which you
want to read.<outfilepath>
is the path where the transcoded XML
file will be saved.A sample XML file is provided with the use case. To use it, launch the following command from the command line:
Windows | CAAXMLDOMTranscode -utf16
InstallRoot\OS\resources\xml\CAAXMLDOMTranscode\CAAXMLDOMTranscode.xml
C:\TEMP\CAAXMLDOMTranscode_utf16.xml |
Unix | CAAXMLDOMTranscode -utf16
InstallRoot/OS/resources/xml/CAAXMLDOMTranscode/CAAXMLDOMTranscode.xml
/tmp/CAAXMLDOMTranscode_utf16.xml |
where:
InstallRoot
is the directory in which you have
installed the run time part or the product lineOS
is the directory containing the installed code
aix_a
for 32-bit AIXhpux_b
for HP-UXsolaris_a
for Solarisintel_a
for 32-bit Windowswin_b64
for 64-bit Windows[Top]
The CAAXMLDOMTranscode use case is made of one file located in the CAAXMLDOMTranscode.m module of the CAAXMLParser.edu framework:
Windows | InstallRootDirectory\CAAXMLParser.edu\CAAXMLDOMTranscode.m\ |
Unix | InstallRootDirectory/CAAXMLParser.edu/CAAXMLDOMTranscode.m/ |
where InstallRootDirectory
is the directory where the
CAA CD-ROM is installed.
[Top]
To parse a DOM document and save it to disk, there are four main steps:
# |
Step |
---|---|
1 | Create the V5 DOM Component |
2 | Parse the XML Document Without Validation |
3 | Write DOM Tree With the Proper Encoding |
4 | Manage Errors |
[Top]
CATIXMLDOMDocumentBuilder_var builder; HRESULT hr = ::CreateCATIXMLDOMDocumentBuilder(builder); ... |
The first step to work with DOM is to instantiate the V5 DOM
component. The V5 DOM component can be created by calling the CreateCATIXMLDOMDocumentBuilder
global function. This function returns a V5 handler on the CATIXMLDOMDocumentBuilder
interface, which is the main interface for the V5 DOM component. Using
this interface you will be able to create documents (either by parsing
an existing XML file, as here, or from scratch) and save existing
documents to disk. Note that the code above does not specify the CLSID
of the component to use, so the default DOM component (XML4C3) will be
used. See [3] and [4] if you want to
use another V5 DOM component.
[Top]
... CATListOfCATUnicodeString readOptions; readOptions.Append("CATDoValidation"); CATListOfCATUnicodeString readOptionValues; readOptionValues.Append("false"); CATIDOMDocument_var document; hr = builder->Parse(inputFile, document, readOptions, readOptionValues); ... |
To parse an XML document, invoke the Parse
method from the CATIXMLDOMDocumentBuilder
interface. The parser can run in two modes: non-validating and
validating. You determine what mode is used in the Parse
method
using the "CATDoValidation"
option. Options are passed to the
parser using two CATListOfCATUnicodeStrings. The first one
contains the option names, the second one contains the option values.
For the purpose of this use case, we just want to transcode the XML file
from one encoding to another, so we will disconnect XML validation. For
a discussion of non-validating parsers versus validating parsers and how
to choose which parser to instantiate, please see [3]
and [4].
[Top]
... CATListOfCATUnicodeString writeOptions; writeOptions.Append("CATEncoding"); CATListOfCATUnicodeString writeOptionValues; if (strcmp(argv[1], "-utf8") == 0) { writeOptionValues.Append("UTF-8"); } else { writeOptionValues.Append("UTF-16"); } hr = builder->WriteToFile(document, outputFile, writeOptions, writeOptionValues); ... |
To create the XML document, which corresponds to the DOM tree, call
the WriteToFile
method. It takes as a parameter the path of the
XML document to be created. The WriteToFile
method also accepts
the CATEncoding
option to control the encoding used in the
resulting file. Options are passed to the parser using two CATListOfCATUnicodeStrings.
The first one contains the option names, the second one contains the
option values.
If the CATEncoding
option is not specified, the resulting
document will use the default (UTF-8) encoding. Note that the "encoding"
attribute will not be specified in the XML declaration. However, the XML
specification indicates that if the encoding attribute is not specified,
XML parsers should consider that the document uses the UTF-8 encoding. Not
all encodings are supported by the parser. For a discussion of supported
encodings and write options, see [3] and [4].
[Top]
The XMLParser framework uses the HRESULT / CATError mechanism to
manage errors. Make sure to use the CATError::CATGetLastError
to obtain all the available error diagnostics when using XMLParser. More
information about V5 error management is available here [2]
and [4].
[Top]
This use case shows you how to read and write XML documents using the DOM API.
[Top]
[1] | Building and Launching a CAA V5 Use Case |
[2] | Managing Errors Using HRESULT |
[3] | Using XML in V5 |
[4] | XML Tips and Tricks |
[Top] |
Version: 1 [May 2005] | Document created |
[Top] |
Copyright © 2005, Dassault Systèmes. All rights reserved.