Barry Tauber of Victor Consulting is a long-time member of the committee
responsible for the COBOL standard (INCITS J4). We continue his
explanation of the integration of XML
support into the COBOL standard.
<< Part 1
There are COBOL vendors, not one COBOL vendor, so there is competition
in the marketplace, not only for the language but within the language.
Without meaning to slight anyone, some of the COBOL compiler vendors are
IBM, Micro Focus, Fujitsu, AcuCorp, and LegacyJ. The vendors like having a
standard, but they like having customers more. When their customers began
requesting XML processing, the vendors responded, but without a standard
they had little guidance.
Since the various individual vendors had been processing XML data for
quite a while, there were several implementations that were both functional
and used in production in commercial applications. They differed only in
minor aspects, and what the committee is basically doing is picking the
best of breed from what already works and formalizing it into the standard.
What COBOL has actually done is define how the data is to be processed
once it has been accessed. It is not a SAX model and it is not exactly a DOM
model; it is native COBOL processing. There is a parser, but it is not
visible to the COBOL code.
For those of you who have some COBOL under your belt, here is how XML is
handled in COBOL, in a nutshell:
- External data connectors are defined in the Environment Division.
- Data itself is defined in the Data Division.
  COBOL has added to the FD structure in the File Section (which connects
  with the SELECT clause) to define the tagged layout of the XML. This
  layout uses the COBOL terminology of level numbers and data names.
  Repeating structures use the standard COBOL OCCURS clause. There are new
  COBOL keywords, IDENTIFIED and ATTRIBUTE, for tags and attributes.
  Traditionally, COBOL handled UTF-8 and ASCII with PIC X clauses, and it
  still does. With the 2002 standard and the introduction of PIC N
  (National) characters, Unicode and UTF-16 are handled as well.
  The structure can be hand coded to map to the physical XML structure,
  but mostly it is not. All the vendors provide an external utility that
  reads the DTD, Schema, or physical XML and generates a COBOL layout.
- Data is processed in the Procedure Division.
  The XML FD is OPENed or CLOSEd (standard COBOL verbs). An OPEN of an XML
  document sucks in the whole document, DOM style, and resolves any
  references, entities, etc. Processing instructions (PIs) are not passed
  through. The structure defined in the Data Division serves as the window
  onto the internal structure. The standard COBOL verbs READ, WRITE,
  REWRITE, DELETE, and START are used to traverse the internal structure.
  A form of CLOSE can write the modified internal structure back out to
  wherever it came from.
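For readers who want to see the open, traverse, and close cycle in motion, here is a rough sketch of the same idea. It is written in Python rather than COBOL, because the exact COBOL syntax lives in the TR and is not reproduced in this article. Everything in it is an assumption made for illustration: the sample document, the `OrderRecord` layout, and the `XmlFile` class with its `open`, `read`, `rewrite`, `delete`, and `close` methods are hypothetical names, not part of any COBOL runtime or vendor product.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical sample document; element and attribute names are made up.
SAMPLE = """<?xml version="1.0"?>
<?app-hint ignored?>
<orders>
  <order id="A1"><customer>Smith</customer><total>10.50</total></order>
  <order id="B2"><customer>Jones</customer><total>99.00</total></order>
</orders>"""


@dataclass
class OrderRecord:
    """Plays the role of the record layout: a window onto one <order>."""
    order_id: str  # filled from the id attribute
    customer: str  # filled from the <customer> child element
    total: str     # filled from the <total> child element


class XmlFile:
    """Toy stand-in for an XML file connector (not a real COBOL runtime)."""

    def open(self, text):
        # "OPEN": the whole document is parsed into memory at once.
        # ElementTree's default parser does not keep processing instructions.
        self.root = ET.fromstring(text)
        self.position = -1

    def read(self):
        # "READ": move to the next repeating element and fill the record.
        orders = self.root.findall("order")
        if self.position + 1 >= len(orders):
            return None  # nothing left to read
        self.position += 1
        node = orders[self.position]
        return OrderRecord(node.get("id"),
                           node.findtext("customer"),
                           node.findtext("total"))

    def rewrite(self, record):
        # "REWRITE": copy the record back into the current element.
        node = self.root.findall("order")[self.position]
        node.set("id", record.order_id)
        node.find("customer").text = record.customer
        node.find("total").text = record.total

    def delete(self):
        # "DELETE": drop the current element from the in-memory tree.
        node = self.root.findall("order")[self.position]
        self.root.remove(node)
        self.position -= 1  # the next element shifts into this slot

    def close(self):
        # "CLOSE": serialize the modified tree (here, to a string).
        return ET.tostring(self.root, encoding="unicode")


def demo():
    f = XmlFile()
    f.open(SAMPLE)
    record = f.read()
    while record is not None:
        if record.customer == "Smith":
            record.total = "12.00"
            f.rewrite(record)
        else:
            f.delete()
        record = f.read()
    return f.close()
```

Calling `demo()` returns the serialized document after one order has been updated and the other removed. The program only ever touches the record; the parser and the tree stay behind the `XmlFile` object, much as the parser is not seen by the COBOL code.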
That is the nutshell. The TR has the detail, and is written as changes
against the 2002 standard. The 2002 standard itself is available from ANSI,
or you can contact me via e-mail.
There were a bunch of companies and a bunch of folks involved, and I
cannot name all the names here (some do not want to be named). The people
responsible for checking for errors and for making sure that changes work
and do not invalidate other working code are:
- Nick Tindall (IBM)
- Jeff Lanam (Hewlett Packard)
- Ann Bennett (IBM)
- Barry Rosetti (Micro Focus)
- Don Schricker (Micro Focus)
- Steve Miller (IBM)
About the author
Barry Tauber is a principal of Victor Consulting and the International
Representative of the INCITS J4 Programming Language COBOL committee.
Resources
- INCITS J4 Programming Language COBOL annual report
- Draft specification (XML and COBOL)