1.
Description |
This module
supports the construction of the W4 XML term representation, according to
XML Info Sets. XML
Namespaces and XML Base are fully supported. The
fundamental implementation decisions are the following:
- All names occurring in an XML document are
represented by Prolog constants, in UTF-8 encoding.
- All text content and attribute values are
represented by lists of Unicode character codes.
- The properties that reference other information items
([parent], [references], [notation] and [owner element]) are not implemented.
This is motivated by the fact that some Prolog systems do not
support cyclic terms.
However, an extension of this module conforming to the full
recommendation is planned.
- XML Base and SystemID URIs (and IRIs) have a special
internal structure, as defined in module IRI, in
order to optimize resolution of relative references.
- In most situations, properties with an unknown value or no value are represented by
empty lists.
In order to maintain compatibility with subsequent versions of our
parser, all applications should use the described
predicates to extract properties from the information items.
For examples, the reader is referred to the implementation of XML
Exclusive Canonicalization and XML Term NS. |
|
2.
Representation of an XML document
The W4 parser creates a term structure containing the full representation of the
XML document it reads. This information is stored in a term of the form
document/10, from which every information item in the XML document is
accessible. The constructed representation differs in some minor aspects
from the XML Info Sets, in particular by constructing an internal
representation for the DTD. The documentation is adapted from the
specification of XML Info Sets. In the following tables, text in bold face
represents items or properties according to the XML Info Sets.
2.1
The document and its content
The Document
Information Item |
document(Children,DocumentElement,Notations,Unparsed,BaseURI,CharacterEncoding,Standalone,Version,All,DTD) |
Children: |
An ordered NodeList of child
information items, in document order. There is only
one element information item in this list (the Document Element). This
list also contains all processing instruction items, comment items,
and the Document Type Declaration item occurring in the Prolog and Epilog
of the XML document. |
DocumentElement: |
The element information item corresponding to the
document element. |
Notations: |
An ordered NamedMap of notation information
items, one for each notation declared in the DTD.
The ordering key is the notation name. |
Unparsed: |
An ordered NamedMap of
unparsed entity
information items, one for each unparsed entity declared in the DTD.
The ordering key is the name of the unparsed entity. |
BaseURI: |
The base URI term of the
document entity, according to the
IRI term representation. |
CharacterEncoding: |
A constant with the name of the
character encoding scheme in which the document entity is expressed. |
Standalone: |
An indication of the standalone status of the
document, either the constant yes or no.
If there is no standalone document declaration, then this argument is
set to the empty list. |
Version: |
A constant representing the XML version
of the document (currently, only '1.0').
The empty list [] if there is no XML declaration. |
All: |
The constant yes or no
indicating
whether the processor has read the complete DTD.
The empty list [] if there is no DTD. |
DTD: |
The Document Type Declaration term
representing the full DTD.
The empty list [] if there is no DTD. |
Element Information
Items |
element(NamespaceURI,
LocalName, Prefix,
Attributes,
NameAttributes, Children, InScope,
BaseURI, Lang) |
NamespaceURI: |
A constant with the namespace
name, if any, of the element type.
The empty constant '' if the element does not belong to a
namespace. |
LocalName: |
A constant representing the local part
of the element-type name. |
Prefix: |
The namespace prefix part of the element-type name.
The empty constant '' if the name is unprefixed. |
Attributes: |
An ordered NamedMap of attribute
information items, one for each of the attributes (specified or
defaulted from the DTD) of this element.
The map is ordered by the key ename(NamespaceURI,LocalName)
obtained from the NamespaceURI and LocalName of each attribute. |
NameAttributes: |
An ordered NamedMap of attribute
information items, one for each of the namespace declarations
(specified or defaulted from the DTD) of this element.
A declaration of the form xmlns="", which undeclares the
default namespace, counts as a namespace declaration. By definition,
all namespace attributes have a namespace URI of
http://www.w3.org/2000/xmlns/
The map is ordered by the key
ename('http://www.w3.org/2000/xmlns/',Prefix), where Prefix is the
constant corresponding to the declared prefix by the namespace attribute.
Prefix is the empty constant '' if the declaration is of the form
xmlns="[SOME URI]". |
Children: |
An ordered Node List of child
information items, in document order. This list contains element,
processing instruction, unexpanded entity reference, character, and
comment information items, one for each element, processing
instruction, reference to an unprocessed external entity, data
character, and comment appearing immediately within the current
element. |
InScope: |
An ordered NamedMap of namespace URIs,
one for each of the namespaces in effect for this element.
The map is ordered by Prefix, and maps the Prefix to the namespace URI
(a namespace information item). The map always contains an item with
prefix xml, which is implicitly bound to the namespace name
http://www.w3.org/XML/1998/namespace.
Furthermore, and deviating from XML Info Sets, the map
also contains a value for the empty prefix '', corresponding to the
default namespace. If there is no default namespace declared, the value
for '' is also the empty constant. |
BaseURI: |
The base URI term of the
element, according to the IRI term
representation. |
Lang: |
The language tag in effect for the
element, represented by a list of Unicode character codes. This might have
been declared in the element or inherited from an ancestor element
declaration. |
The current implementation does not support the
[parent] property of Element Information Items.
Attribute Information
Items |
attribute(NamespaceURI,
LocalName, Prefix,
Value, Specified,
Type) |
NamespaceURI: |
A constant with the namespace
name, if any, of the attribute.
The empty constant '' if the attribute does
not belong to a namespace. |
LocalName: |
A constant representing the local part
of the attribute name. |
Prefix: |
The namespace prefix part of the
attribute name.
The empty constant '' if the name is unprefixed. |
Value: |
A list of Unicode character codes with the
normalized attribute value. |
Specified: |
The constant yes if this attribute
was actually specified in the start-tag of its element;
the constant no if it was defaulted from the DTD. |
Type: |
The type declared for this attribute
in the DTD. Currently, it is always set to the empty list.
The efficiency impact of supporting this property for documents
without DTDs is being evaluated. |
The current implementation does not support the
[references] and [owner element] properties of Attribute
Information Items.
Processing Instruction
Information Items |
pi(Target,Content,BaseURI) |
Target: |
A constant representing the target
part of the processing instruction. |
Content: |
A list of Unicode character codes
representing the content of the processing instruction, excluding the
target and any white space immediately following it. |
BaseURI: |
The base URI term of the PI,
according to the IRI term representation. |
The current implementation does not support the
[notation] and [parent] properties of Processing Instruction
Information Items.
Comment Information
Items |
comment(Content) |
Content: |
A list of Unicode character codes
representing the content of the comment. |
The current implementation does not support the
[parent] property of Comment Information Items.
Character Information
Items |
pcdata(Content) or
whitespace(Content) |
Content: |
A list of Unicode character codes
representing text content. If the
content is white space appearing within element content,
then the function symbol is whitespace; otherwise it is pcdata. |
The current implementation does not support the
[parent] property of Character Information Items.
2.2
The Document Type Declaration
In this section we describe the Document Type Declaration
Information Item and the representation of element specifications and
attribute declarations. Notice that XML Info Sets does not specify items
for the representation of information inside the DTD, other than PIs.
Document Type
Declaration Information Item |
documenttype(QName,
PublicId, SystemId,
ElemDecl, AttDecl, Children ) |
QName: |
A term of the form
qname(Prefix,LocalName) representing the document element
qualified name, as it appears in the DOCTYPE declaration. |
PublicId: |
A constant with the public
identifier of the external subset, as it appears in
the DOCTYPE declaration. |
SystemId: |
An IRI ref term with the system
identifier of the external subset, as it appears in
the DOCTYPE declaration. The empty list if a system identifier is not
provided. |
ElemDecl: |
An ordered NamedMap of element content
specification items, one for each element type declaration found in
the internal subset of the Document Type Declaration.
The map is ordered by the key qname(Prefix,LocalName) obtained
from the element's qualified name being declared. |
AttDecl: |
An ordered NamedMap of attribute list
declarations, one for each element with declared attributes found in the
internal subset of the DTD.
The map is ordered by the key qname(Prefix,LocalName) obtained
from the element's qualified name. |
Children: |
An ordered NodeList of processing
instruction information items appearing in the DTD, in document order.
|
The current implementation does not support the
[parent] property of the Document Type Declaration Information Item.
Element Content
Specification Items |
spec(ContentSpec) |
ContentSpec: |
A term representing the
content specification of an element. This term has one of the following forms:
- the constant empty
- the constant any
- the constant '#pcdata'
- the term times( seq(['#pcdata']) )
- the term times( choice( ['#pcdata'|Names])),
where Names is a non-empty list of qualified names of the form
qname(Prefix,LocalName).
- a term CP, times(CP), plus(CP) or
opt(CP), where CP is a CP term of the form choice(ListOfCP) or
seq(ListOfCP), as described below.
The CP term represents a choice or a sequence in
the content specification and has one of the following forms:
- a qualified name term qname(Prefix,LocalName).
- a term choice(ListOfCP), where ListOfCP is a list of CP terms.
- a term seq(ListOfCP), where ListOfCP is a
list of CP terms.
- a term of the form times(CP), plus(CP) or
opt(CP), where CP is a CP term.
|
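To make the shape of content specification terms concrete, the following minimal sketch renders such a term back into DTD-style text. It is not part of the module: the predicate names content_spec_atom/2, cp_atom/2 and the helpers are hypothetical, and the sketch assumes that an unprefixed qualified name is written qname('',LocalName).

```prolog
% Hypothetical helper, not exported by the XML DOM module.
% content_spec_atom(+ContentSpec, -Atom): DTD-style text of a content
% specification term, as described in the list above.
content_spec_atom(empty, 'EMPTY') :- !.
content_spec_atom(any, 'ANY') :- !.
content_spec_atom('#pcdata', '(#PCDATA)') :- !.
content_spec_atom(Spec, Atom) :- cp_atom(Spec, Atom).

% cp_atom(+CP, -Atom): text of a single CP term.
cp_atom('#pcdata', '#PCDATA').
cp_atom(qname(Prefix, Local), Atom) :- qname_atom(Prefix, Local, Atom).
cp_atom(seq(CPs), Atom)    :- group_atom(CPs, ',', Atom).
cp_atom(choice(CPs), Atom) :- group_atom(CPs, '|', Atom).
cp_atom(times(CP), Atom)   :- cp_atom(CP, A), atom_concat(A, '*', Atom).
cp_atom(plus(CP), Atom)    :- cp_atom(CP, A), atom_concat(A, '+', Atom).
cp_atom(opt(CP), Atom)     :- cp_atom(CP, A), atom_concat(A, '?', Atom).

% Assumption: the empty constant '' stands for "no prefix".
qname_atom('', Local, Local) :- !.
qname_atom(Prefix, Local, Atom) :-
    atom_concat(Prefix, ':', P),
    atom_concat(P, Local, Atom).

% group_atom(+CPs, +Separator, -Atom): parenthesised, separated group.
group_atom(CPs, Sep, Atom) :-
    join_atoms(CPs, Sep, Inner),
    atom_concat('(', Inner, A0),
    atom_concat(A0, ')', Atom).

join_atoms([CP], _, Atom) :- !, cp_atom(CP, Atom).
join_atoms([CP|Rest], Sep, Atom) :-
    cp_atom(CP, A),
    join_atoms(Rest, Sep, R),
    atom_concat(A, Sep, AS),
    atom_concat(AS, R, Atom).
```

Under these assumptions, the term seq([qname('',title), plus(qname('',author))]) is rendered as '(title,author+)', and times(choice(['#pcdata', qname('',b)])) as '(#PCDATA|b)*'.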
An attribute list declaration item collects in a
single structure all the attribute declarations found in the DTD for a given
element. Notice that an attribute declaration is allowed
without a corresponding element type declaration. However, the converse
(an element type declaration without attribute declarations) means that no
attributes may appear in the element.
Attribute List
Declaration Items |
attlist(AttDecl) |
AttDecl: |
AttDecl is an ordered NamedMap of
terms of the form attribute_decl(Type,Default). The map is
ordered by the key qname(Prefix,LocalPart) obtained from the
attribute qualified name in the DTD. The Type
argument is a term of the form:
- one of the constants cdata, id, idref,
idrefs, entity, entities, nmtoken, nmtokens
- enum(ListOfNmtokens), where
ListOfNmtokens is a list of constants corresponding to the name
tokens (constants in UTF-8 encoding) found in the attribute
declaration.
- notations(ListOfNCNames), where ListOfNCNames
is a list of NCNames (constants in UTF-8 encoding).
The Default argument is a term of the form:
- the constant required or implied
- the term fixed(Value), where Value
is a list of Unicode character codes representing the fixed attribute
value.
- the term default(Value), where Value
is a list of Unicode character codes representing the default
attribute value.
|
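As an illustration of the Type and Default terms described above, the sketch below writes down the attribute_decl/2 terms one would expect for a small attribute list declaration and extracts the value that the DTD supplies when the attribute is omitted. The predicates sample_decl/2 and dtd_supplied_value/2 are hypothetical and not part of the module, and the sample terms are an assumption derived from the description, not parser output.

```prolog
% Terms assumed for the declaration
%   <!ATTLIST img align (left|right) "left"
%                 id    ID           #REQUIRED>
% sample_decl(?AttributeName, ?AttributeDecl)
sample_decl(align, attribute_decl(enum([left, right]), default(Codes))) :-
    atom_codes(left, Codes).      % values are lists of character codes
sample_decl(id, attribute_decl(id, required)).

% dtd_supplied_value(+AttributeDecl, -Codes): the value the DTD supplies
% when the attribute is not specified. Fails for required and implied
% attributes, which have no such value.
dtd_supplied_value(attribute_decl(_Type, fixed(Codes)), Codes).
dtd_supplied_value(attribute_decl(_Type, default(Codes)), Codes).
```

For example, the goal sample_decl(align, D), dtd_supplied_value(D, Codes) binds Codes to the character codes of left, while the same query for id fails.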
2.3
Entities and Notations
The XML Info Sets only require storing the unparsed
entities and notations declared in the DTD. Internal parsed entities
and parameter entities are properly dealt with by the XML Parser, but
no representation of them is accessible to the user.
There is a notation information item for each
notation declared in the DTD.
Notation Information
Items |
notation(Name,
PublicId, SystemId,
BaseURI) |
Name: |
A constant with the XML Name of the
notation. |
PublicId: |
A constant with the public
identifier of the notation.
The empty list if the public identifier is not provided. |
SystemId: |
The system identifier
URI, according to the IRI term
representation.
The empty list if a system identifier is not provided. |
BaseURI: |
The base URI relative to which the system identifier
should be resolved, according to the
IRI term representation. |
Unparsed entity information items are stored in the
document information item. There is one
unparsed entity information item for each unparsed general entity
declared in the DTD. Unparsed entities are not expanded in
attribute values since they are not read. The [notation] property
of Unparsed Entity Information Items is not supported.
Unparsed Entity
Information Items |
unparsed_entity(Name,
PublicId, SystemId,
BaseURI,NotationName) |
Name: |
A constant with the XML Name of the
unparsed entity. |
PublicId: |
A constant with the public
identifier of the unparsed entity.
The empty list if the public identifier is not provided. |
SystemId: |
The system identifier
URI, according to the IRI term
representation.
The empty list if a system identifier is not provided. |
BaseURI: |
The base URI relative to which the system identifier
should be resolved, according to the
IRI term representation. |
NotationName: |
The notation name associated with the entity. |
An unexpanded entity reference information item serves as a placeholder
by which the XML processor indicates
that it has not expanded an external parsed entity. There is such an
information item for each unexpanded reference to an external general
entity within the content of an element. The [parent] property of
Unexpanded Entity Reference Information Items is not supported.
Unexpanded Entity
Reference Information Items |
unexpanded_entity(Name,
PublicId, SystemId,
BaseURI) |
Name: |
A constant with the XML Name of the
external parsed entity. |
PublicId: |
A constant with the public
identifier of the external parsed entity.
The empty list if the public identifier is not provided. |
SystemId: |
The system identifier
URI, according to the IRI term
representation.
The empty list if a system identifier is not provided. |
BaseURI: |
The base URI relative to which the system identifier
should be resolved, according to the
IRI term representation. |
|
|
3.
Usage of the XML DOM Module
The XML DOM module is expected to be used in connection with the W4 XML Parser.
Currently, it is not the intent of the XML DOM module to define an API for dynamically
constructing XML terms. Therefore, only inspection predicates are
described in this section, even though the current implementation exports
"low-level" predicates for constructing XML DOM terms. These are for
internal use of the XML Parser and should not be used in applications. A
full-blown API is being devised.
3.1
Working with the document and its content
The following predicates allow users to perform
most of the tasks required in applications. The typical application
extracts the document (or root) element from the document term and starts
processing. The first set of predicates operates on document item terms.
Document Item Terms
- isDocument( +Item )
Succeeds if the argument is a document item term.
- getDocumentChildren( +XMLDoc, ChildList )
This predicate returns in the ChildList argument an ordered NodeList
containing the child information item terms, in document order, of
the XML document term XMLDoc.
There is only one element information item term in this list (the
Document Element). This list also contains all processing instruction
items, comment items, and the Document Type Declaration item occurring in the
Prolog and Epilog of the XML document.
- getDocumentElement( +XMLDoc, Element )
This predicate returns in argument Element the element information item
term corresponding to the document element
of the given XMLDoc document item term.
- getDocumentNotations( +XMLDoc, NotationMap )
This predicate returns in argument NotationMap an ordered Named Map
containing the declared notation information items of the given
XMLDoc document item term. This map is
ordered by the name of the notation, an NCName, which is a constant in
UTF-8.
- getDocumentUnparsedEntities( +XMLDoc, UnparsedMap )
This predicate returns in argument UnparsedMap
an ordered Named Map containing the declared unparsed external entity
information items of the given XMLDoc
document item term. This map is ordered by the name of the unparsed
entity, an NCName, which is a constant in UTF-8.
- getDocumentBaseURI( +XMLDoc, BaseURI )
Obtains the BaseURI of the given XMLDoc document item
term. The value of argument BaseURI is an
IRI reference term.
- getDocumentEncoding( +XMLDoc, Encoding )
Obtains the Encoding of the given XMLDoc document item term.
Argument Encoding is a constant with
the name of the character encoding scheme in which the document entity
is expressed.
- getDocumentStandalone( +XMLDoc, Standalone )
This predicate returns in argument Standalone the indication of the
standalone status of the given XMLDoc document item term. Argument
Standalone is the constant yes or no, or the
empty list if such information was not provided in the XML declaration.
Element Item Terms
- isElement( +Item )
Succeeds if the argument is an XML element item term.
- getElementName( +EltItem, NamespaceURI, Local, Prefix )
Given an element item term in the argument EltItem, this predicate returns the
NamespaceURI, the Local part, and the
Prefix of the element's qualified name.
The NamespaceURI and Local identify the element. The
NamespaceURI should be an absolute URI,
while Local and Prefix are NCNames. The last three arguments are constants in
UTF-8 encoding. Both Prefix and
NamespaceURI are the empty constant
'' whenever the element does not belong
to a namespace.
- getElementChildren( +EltItem, ChildList )
This predicate returns in the ChildList
argument an ordered NodeList containing the child information
item terms, in document order, of
the XML element item term EltItem.
Notice that no two character data items may appear consecutively, and
that references to unprocessed external
entities also appear in the node list.
- getElementAttributes( +EltItem, Attributes, NSAttributes )
Predicate getElementAttributes/3
returns the two ordered Named Maps Attributes
and NSAttributes of the XML
element item term EltItem.
Map Attributes contains all
attribute information items, one for each of the attributes (specified
or defaulted from the DTD) of this element. The
map is ordered by the key ename(NamespaceURI,LocalName) obtained
from the NamespaceURI and LocalName of each attribute.
Map NSAttributes contains
attribute information items, one for each of the namespace declarations
(specified or defaulted from the DTD) of this element.
The map is ordered by the key ename('http://www.w3.org/2000/xmlns/',Prefix),
where Prefix is the constant corresponding to the prefix declared by the
namespace attribute. Prefix is the empty constant
'' if the declaration is of the form
xmlns="[SOME URI]".
- getElementInScopeNamespaces( +EltItem, Namespaces )
This predicate returns in the Namespaces
argument an ordered Named Map of namespace URIs, one for each of
the namespaces in effect for the element
item term EltItem. The map is
ordered by Prefix, and maps the Prefix to the namespace URI (a namespace
information item). The map always contains an item with prefix
xml, which is implicitly bound to the
namespace name http://www.w3.org/XML/1998/namespace. The
map also contains a value for the empty prefix
'', corresponding to the default namespace. If there is no
default namespace declared, the value for
'' is also the empty constant.
- getElementBaseURI( +EltItem, BaseURI )
This predicate returns in the BaseURI
argument an IRI reference term with the Base URI of the element item in
the given EltItem argument.
- getElementLang( +EltItem, Lang )
This predicate returns in the Lang
argument a list of Unicode character codes with the language tag for the
element item in the given EltItem
argument.
Attribute Item Terms
- isAttribute( +Item )
Succeeds if the argument is an XML attribute item term.
- getAttributeName( +AttItem, NamespaceURI, Local, Prefix )
Given an attribute item term in the argument
AttItem, this predicate returns the
NamespaceURI, the Local part, and the
Prefix of the qualified name. The
NamespaceURI and Local uniquely identify this
attribute. The NamespaceURI should
be an absolute URI, while Local and
Prefix are NCNames. The last three
arguments are constants in UTF-8 encoding. Both
Prefix and NamespaceURI are the
empty constant '' whenever the attribute
does not belong to a namespace.
- getAttributeValue( +AttItem, Value )
This predicate returns in the
Value argument the list of Unicode character codes corresponding
to the normalized value of the attribute item given in argument
AttItem.
- getAttributeSpecified( +AttItem, Specified )
This predicate returns the
Specified flag of the attribute item in argument
AttItem. The
Specified argument is yes if this
attribute was actually specified in the start-tag of its element,
or no if it was defaulted from the DTD.
- getAttributeType( +AttItem, Type )
Obtains the Type of the
attribute item term AttItem, as declared
in the DTD, or the empty list if the attribute was not declared.
Currently, it always returns the empty list.
Processing Instruction Item Terms
- isPI( +Item )
Succeeds if the argument is an XML processing
instruction item term.
- getPITarget( +PIItem, Target )
This predicate returns in the Target
argument a constant with the target of the processing instruction item
provided in the PIItem argument. The
target constant is an NCName in UTF-8 encoding.
- getPIContent( +PIItem, Content )
This predicate returns in the
Content argument a list of Unicode character codes of the
processing instruction content in the given
PIItem term, excluding the mandatory whitespace after the target
and the final ?> delimiter.
- getPIBaseURI( +PIItem, BaseURI )
This predicate returns in the BaseURI
argument an IRI reference term with the Base URI of the processing
instruction item in the given PIItem
term.
Comment Item Terms
- isComment( +Item )
Succeeds if the argument is an XML comment item term.
- getCommentContent( +CommItem, Content )
This predicate returns in the
Content argument a list of Unicode character codes of the comment
content in the given CommItem term. This
does not include the starting <!-- and finishing --> comment delimiters.
Character Information Items
- isCharData( +Item )
Succeeds if the argument is an XML Character Data
Information item, including whitespace.
- isWhiteSpace( +Item )
Succeeds if the argument is whitespace.
- getCharData( +CharItem, Content )
This predicate returns in argument
Content a list of Unicode character
codes of the text content in the CharItem
term.
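The typical first step described at the start of this section, extracting the document element and inspecting its name, might look as follows. This is a sketch only: root_qname/2 is a hypothetical application predicate, and the two accessor clauses are stand-ins written from the term layout in Section 2.1 so that the example runs on its own. In a real application they would be omitted and the predicates exported by the module used instead.

```prolog
% Stand-ins for the module's accessors, assumed from Section 2.1.
% Do not load them together with the real XML DOM module.
getDocumentElement(document(_, Element, _, _, _, _, _, _, _, _), Element).
getElementName(element(NamespaceURI, Local, Prefix, _, _, _, _, _, _),
               NamespaceURI, Local, Prefix).

% root_qname(+XMLDoc, -QName): qualified name of the document element,
% written as 'prefix:local', or just 'local' when it is unprefixed.
root_qname(XMLDoc, QName) :-
    getDocumentElement(XMLDoc, Element),
    getElementName(Element, _NamespaceURI, Local, Prefix),
    (   Prefix == ''
    ->  QName = Local
    ;   atom_concat(Prefix, ':', P0),
        atom_concat(P0, Local, QName)
    ).
```

Since the namespace URI and the local name are what identify the element, an application would normally compare those two values and use the prefix only for display, as done here.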
3.2
Working with the Document Type Declaration
The XML DOM module provides basic support for DTDs. The
user is able to obtain all the properties of Document Type Declaration
information items, plus the specification of attributes and elements.
Currently, no information about internal entities is kept, and
therefore one cannot "regenerate" the original document.
Document Type Declaration Item Terms
- isDocumentType( +Item )
Succeeds if the argument is a Document Type
Declaration item term.
- getDocumentTypeChildren( +DTDItem, ChildList )
This predicate returns in the ChildList
argument an ordered NodeList of processing instruction information items
appearing in the DTD, in document order. For future compatibility,
users should check that the items they process are PIs. This property might
be extended to contain all the markup declarations in the DTD.
- getDocumentTypeQualifiedName( +DTDItem, QName )
This predicate returns in argument
QName a term of the form
qname(Prefix,LocalName) representing the document element qualified
name, as it appears in the Document Type Declaration
item term DTDItem.
- getDocumentTypePublicId( +DTDItem, PublicId )
This predicate returns in
argument PublicId a constant with
the public identifier of the external subset, as
it appears in the Document Type Declaration item term
DTDItem.
- getDocumentTypeSystemId( +DTDItem, SystemId )
This predicate returns in
argument SystemId an IRI ref term
with the system identifier of the external subset,
as it appears in the Document Type Declaration
item term DTDItem.
- getDocumentTypeElementDeclarations( +DTDItem, ElemDecl )
This predicate returns in
argument ElemDecl an ordered
NamedMap of element content specification items, one for each element
type declaration found in the Document Type
Declaration item term DTDItem.
The map is ordered by the key qname(Prefix,LocalName) obtained
from the qualified name of the element being declared.
- getDocumentTypeAttributeDeclarations( +DTDItem, AttDecl )
This predicate returns in
argument AttDecl an ordered NamedMap
of attribute list declarations, one for each element
with declared attributes found in the
Document Type Declaration item term
DTDItem. The map is ordered by
the key qname(Prefix,LocalName) obtained from the
element's qualified name.
Additionally, the following predicates can be used to
obtain the element and attribute declarations directly from the DTD.
- getElementSpecificationFromDTD( +DTDItem, +ElemQName, ElemSpec )
This predicate returns in
argument ElemSpec an
element specification term describing the content of the
element with the qualified name ElemQName,
as it appears in the Document Type Declaration
item term DTDItem. The element qualified
name must be a term of the form qname(Prefix,Local).
- getAttributeDeclarationFromDTD( +DTDItem, +ElemQName, +AttQName, Type, Default )
This predicate returns in
arguments Type and Default the
declaration of attribute AttQName in
element ElemQName, as it appears
in the Document Type Declaration item term
DTDItem. The element's and the attribute's
qualified names must be terms of the form qname(Prefix,Local).
- getDefaultAttributesFromDTD( +DTDItem, +QName, Attributes, NSAttributes )
This predicate returns the
ordered NamedMaps with the default attributes and
namespace attributes obtained from the Document
Type Declaration item term DTDItem, for
element QName. The element qualified
name must be a term of the form qname(Prefix,Local).
Element Specification Terms
- isElementSpecification( +Item )
Succeeds if the argument is an Element
Specification item.
- getElementSpecification( +ElemSpec, ContentSpec )
This predicate returns in the ContentSpec argument
the content specification term of the element specification item
ElemSpec. The structure of content
specification terms is described in Section 2.2
above.
Attribute List Declaration Terms
- isAttributeListDeclaration( +Item )
Succeeds if the argument is an Attribute List
Declaration item.
- getAttributeListDeclaration( +AttList, AttMap )
Obtains the ordered NamedMap AttMap of
terms of the form attribute_decl(Type,Default). The map is
ordered by the key qname(Prefix,LocalPart) obtained from the
attribute qualified name in the DTD.
- getAttributeDeclaration( +AttDecl, Type, Default )
This predicate returns in
arguments Type and Default the
declaration extracted from a term of the form attribute_decl(Type,Default)
in AttDecl. The structure of type
and default value terms is described in Section 2.2
above.
- getAttributeDeclaration( +AttList, +AttQName, Type, Default )
This predicate returns in
arguments Type and Default the
declaration of attribute AttQName in the
attribute list declaration item provided in argument
AttList. The attribute's
qualified name must be a term of the form qname(Prefix,Local).
The structure of type and default value terms is described in
Section 2.2 above.
3.3
Working with Entities and Notation
The document item term keeps ordered Named Maps
containing the notations and unparsed entities declared in the internal
part of the DTD. Furthermore, external parsed entity references are
substituted by unexpanded entity reference items.
- isExternalEntity( +Item )
Succeeds if the argument is an (unexpanded)
external parsed entity reference.
- isUnparsedEntity( +Item )
Succeeds if the argument is an unparsed entity
declaration item.
- getEntityName( +EntItem, NCName )
This predicate returns in argument
NCName the name of the (unparsed or
unexpanded) entity item given in argument
EntItem.
- getEntityPublicId( +EntItem, PublicId )
This predicate returns in
argument PublicId a constant with
the public identifier of the (unparsed or unexpanded)
entity item given in argument EntItem.
- getEntitySystemId( +EntItem, SystemId )
This predicate returns in
argument SystemId an IRI ref term
with the system identifier of the (unparsed or
unexpanded) entity item given in argument
EntItem.
- getEntityBaseURI( +EntItem, BaseURI )
This predicate returns in
argument BaseURI an IRI ref term
with the base URI relative to which the system
identifier of the EntItem
term should be resolved.
- getEntityNotationName( +EntItem, NotName )
This predicate returns in
argument NotName the NCName notation name associated with the entity
term EntItem.
Notation Item Terms
- isNotation( +Item )
Succeeds if the argument is a Notation item term.
- getNotationName( +NotItem, NCName )
This predicate returns in argument
NCName the name of the notation item
given in argument NotItem.
- getNotationPublicId( +NotItem, PublicId )
This predicate returns in
argument PublicId a constant with
the public identifier of the notation item term
NotItem.
- getNotationSystemId( +NotItem, SystemId )
This predicate returns in
argument SystemId an IRI ref term
with the system identifier of the notation item term
NotItem.
- getNotationBaseURI( +NotItem, BaseURI )
This predicate returns in
argument BaseURI an IRI ref term
with the base URI relative to which the system
identifier of the NotItem
term should be resolved.
3.4
Working with Node lists and Named Maps
The XML DOM representation uses node lists and named
maps to represent several properties of the XML Info Sets.
Even though both Node Lists and Named Maps are ordinary
lists, we suggest using the following predicates to traverse them, in
order to guarantee compatibility with future versions of the XML DOM
Module.
Node Lists
Node lists are used in the XML DOM module to represent the
Children of the Document Item, Element Item, and Document Type Declaration
items. The iterations are programmed using the predicates
getHeadNodeList/2,
getTailNodeList/2 and
isEmptyNodeList/1.
- isNodeList( +Item )
Succeeds if the argument is a Node List term.
- isEmptyNodeList( +NodeList )
Succeeds if the argument is an empty Node List term.
- getHeadNodeList( +NodeList, Item )
This predicate returns in argument
Item the first element of
NodeList. Fails if
NodeList is empty.
- getTailNodeList( +NodeList, Tail )
This predicate returns in
argument Tail the node list obtained by
removing the first element in NodeList.
Fails if NodeList is empty.
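The iteration scheme over node lists can be sketched as below, here used to collect the character data of an element and all of its descendants. The predicates element_text/2, nodes_text/2 and item_text/2 are hypothetical application code. The clauses in the first part are stand-ins for the module's predicates, written from the term layout in Section 2.1 and from the statement that node lists are ordinary lists; they only make the example runnable on its own and should not be loaded together with the real module.

```prolog
:- use_module(library(lists)).    % append/3

% Stand-ins for the module's predicates (assumptions, see above).
isElement(element(_, _, _, _, _, _, _, _, _)).
getElementChildren(element(_, _, _, _, _, Children, _, _, _), Children).
isCharData(pcdata(_)).
isCharData(whitespace(_)).
getCharData(pcdata(Content), Content).
getCharData(whitespace(Content), Content).
isEmptyNodeList([]).
getHeadNodeList([Item|_], Item).
getTailNodeList([_|Tail], Tail).

% element_text(+EltItem, -Codes): character data of the element and of
% all its descendant elements, in document order.
element_text(EltItem, Codes) :-
    getElementChildren(EltItem, Children),
    nodes_text(Children, Codes).

% nodes_text(+NodeList, -Codes): traverse the list with the node list
% predicates only, never by matching on the list itself.
nodes_text(NodeList, []) :-
    isEmptyNodeList(NodeList), !.
nodes_text(NodeList, Codes) :-
    getHeadNodeList(NodeList, Item),
    getTailNodeList(NodeList, Rest),
    item_text(Item, ItemCodes),
    nodes_text(Rest, RestCodes),
    append(ItemCodes, RestCodes, Codes).

item_text(Item, Codes) :- isCharData(Item), !, getCharData(Item, Codes).
item_text(Item, Codes) :- isElement(Item), !, element_text(Item, Codes).
item_text(_, []).   % comments, PIs and unexpanded entity references
```

Because the traversal goes through isEmptyNodeList/1, getHeadNodeList/2 and getTailNodeList/2, the application code would not need to change if a future version of the module represented node lists differently.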
Ordered Named Maps
Ordered Named Maps are used in the XML DOM module to
represent the ordered lists of unparsed entities and notation items in the
Document item; attributes, namespace attributes and in-scope namespace
items in element items; and element and attribute list declarations in
document type definition items. Iteration is programmed using the
predicates getFirstNamedMap/2, getFirstNamedMap/3,
getRestNamedMap/2 and
isEmptyNamedMap/1. The Named Map is
ordered by a complex term key, using the usual Prolog
@< term ordering. Additionally, a predicate
is provided for searching the named map.
- isNamedMap(
+Item )
Succeeds if the argument is an ordered Named Map.
-
isEmptyNamedMap(
+NamedMap )
Succeeds if the argument is an empty Named Map.
- getFirstNamedMap(
+NamedMap,
Item )
This predicate returns in argument
Item the first element of
NamedMap. Fails if
NamedMap is empty.
- getFirstNamedMap(
+NamedMap,
Key, Item )
This predicate returns in argument
Item the first element of
NamedMap, and the corresponding key in
the second argument. Fails if NamedMap
is empty.
- getRestNamedMap(
+NamedMap,
Tail )
This predicate returns in
argument Tail the ordered map obtained
by removing the first element in NamedMap.
Fails if NamedMap is empty.
- getNamedItem(
+NamedMap,
+ Key, Item )
This predicate returns in argument
Item the element of
NamedMap with the key provided in the
second argument. Fails if NamedMap does
not contain an element with this key.
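As an illustration of the Named Map predicates, the sketch below collects the keys of a map and looks up an item by key. It is not taken from the module: namedMapKeys/2 is a hypothetical helper, and the stand-in definitions assume that a Named Map is a list of Key-Item pairs sorted by Key under the standard order of terms. The actual internal representation is not documented here and may differ, which is exactly why client code should go through the accessors.

```prolog
% Stand-in definitions. ASSUMPTION: a Named Map is a list of Key-Item
% pairs sorted by Key with @<. Omit these when the real module is loaded.
isEmptyNamedMap( [] ).
getFirstNamedMap( [Key-Item|_], Key, Item ).
getRestNamedMap( [_|Rest], Rest ).

% The lookup exploits the ordering: it stops as soon as a larger key is
% met, and fails if the key is absent.
getNamedItem( [K-I|Rest], Key, Item ) :-
    (   K == Key -> Item = I
    ;   K @< Key -> getNamedItem( Rest, Key, Item )
    ).

% namedMapKeys( +NamedMap, -Keys )
% Hypothetical helper: Keys is the list of keys of NamedMap, in map order.
namedMapKeys( NamedMap, [] ) :-
    isEmptyNamedMap( NamedMap ), !.
namedMapKeys( NamedMap, [Key|Keys] ) :-
    getFirstNamedMap( NamedMap, Key, _ ),
    getRestNamedMap( NamedMap, Rest ),
    namedMapKeys( Rest, Keys ).
```

The compound keys used when trying this out, for instance key( a, x ), are purely illustrative; the key terms built by the parser are not specified in this section.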
|
|
4.
Sample Code
The processing of XML documents in Prolog is
rather straightforward, but sometimes tedious. This section presents
sample code for writing an XML document term to a stream. The
user can easily adapt the code to iterate over the several item term
types. This sample code is a subset of our module XML Write, which may
be used as a general template for XML Document term processing. The code
below must be complemented with definitions of the predicates
writeString/2, writeEscapedString/2 and
writeEscapedAttributeValue/2.
% Entry point: writes a document item term to Stream.
writeSimpleXML( Stream, Item ) :-
    isDocument( Item ), !,
    writeXMLDocumentItem( Stream, Item ).

% Only writes the document item children.
% The XML Declaration and the DTD are ignored.
writeXMLDocumentItem( Stream, Doc ) :-
    getDocumentChildren( Doc, Children ),
    writeXMLNodeList( Stream, Children ).
% Iterates over the items in the NodeList.
% Items of any other type are silently skipped.
writeXMLNodeList( Stream, NodeList ) :-
    getHeadNodeList( NodeList, Item ), !,
    (   isElement( Item )        -> writeXMLElement( Stream, Item )
    ;   isCharData( Item )       -> writeXMLCharData( Stream, Item )
    ;   isComment( Item )        -> writeXMLComment( Stream, Item )
    ;   isPI( Item )             -> writeXMLPI( Stream, Item )
    ;   isExternalEntity( Item ) -> writeXMLEntityReference( Stream, Item )
    ;   true
    ),
    getTailNodeList( NodeList, RestNodeList ), !,
    writeXMLNodeList( Stream, RestNodeList ).
writeXMLNodeList( _, NodeList ) :-
    isEmptyNodeList( NodeList ).
% Writes an element
writeXMLElement( Stream, EltItem ) :-
    getElementName( EltItem, _, Local, Prefix ),
    write( Stream, '<' ),
    writeQName( Stream, Prefix, Local ),
    getElementAttributes( EltItem, Attributes, NSAttributes ),
    writeXMLAttributes( Stream, NSAttributes ),
    writeXMLAttributes( Stream, Attributes ),
    write( Stream, '>' ),
    getElementChildren( EltItem, Children ),
    writeXMLNodeList( Stream, Children ),
    write( Stream, '</' ),
    writeQName( Stream, Prefix, Local ),
    write( Stream, '>' ).
% Iterates over the attributes and writes them
writeXMLAttributes( Stream, NamedMap ) :-
    getFirstNamedMap( NamedMap, _, Att ), !,
    writeXMLAttribute( Stream, Att ),
    getRestNamedMap( NamedMap, RestNamedMap ), !,
    writeXMLAttributes( Stream, RestNamedMap ).
writeXMLAttributes( _, NamedMap ) :-
    isEmptyNamedMap( NamedMap ).

writeXMLAttribute( Stream, Att ) :-
    getAttributeName( Att, _, Local, Prefix ),
    write( Stream, ' ' ),
    writeQName( Stream, Prefix, Local ),
    getAttributeValue( Att, NormValue ),
    write( Stream, '="' ),
    writeEscapedAttributeValue( Stream, NormValue ),
    write( Stream, '"' ).
% Writes character data, escaping markup characters
writeXMLCharData( Stream, CharData ) :-
    getCharData( CharData, Content ),
    writeEscapedString( Stream, Content ).

% Writes a comment; the text is written unescaped
writeXMLComment( Stream, Comment ) :-
    getCommentContent( Comment, Text ),
    write( Stream, '<!--' ),
    writeString( Stream, Text ),
    write( Stream, '-->' ).

% Writes a processing instruction; the content is optional
writeXMLPI( Stream, PI ) :-
    getPITarget( PI, Target ),
    getPIContent( PI, Content ),
    write( Stream, '<?' ),
    writeNCName( Stream, Target ),
    (   Content \= [] ->
        write( Stream, ' ' ),
        writeString( Stream, Content )
    ;   true
    ),
    write( Stream, '?>' ).

% Writes a reference to an external entity
writeXMLEntityReference( Stream, EntityRef ) :-
    getEntityName( EntityRef, EntName ),
    write( Stream, '&' ),
    writeNCName( Stream, EntName ),
    write( Stream, ';' ).
% NCNames are already in UTF-8 encoding
writeNCName( Stream, Name ) :-
    write( Stream, Name ).

% The prefix and local parts of a qualified name are already in UTF-8 encoding.
% Notice how the empty prefix is tested.
writeQName( Stream, Prefix, Local ) :-
    (   Prefix = '' ->
        write( Stream, Local )
    ;   write( Stream, Prefix ),
        write( Stream, ':' ),
        write( Stream, Local )
    ).
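The three helper predicates that the sample code calls but does not define could be sketched as follows. These are not the definitions shipped in XML Write; they are a minimal illustration that assumes text content and attribute values are lists of Unicode character codes (as stated in the module description) and that the output stream takes care of the character encoding. writeEscapedCode/2 is a hypothetical auxiliary predicate introduced only for this sketch.

```prolog
% writeString( +Stream, +Codes ): writes the character codes unchanged.
writeString( _, [] ).
writeString( Stream, [C|Cs] ) :-
    put_code( Stream, C ),
    writeString( Stream, Cs ).

% writeEscapedString( +Stream, +Codes ): writes character data,
% replacing &, < and > by the predefined entity references.
writeEscapedString( _, [] ).
writeEscapedString( Stream, [C|Cs] ) :-
    writeEscapedCode( Stream, C ),
    writeEscapedString( Stream, Cs ).

% writeEscapedCode( +Stream, +Code ): hypothetical auxiliary predicate
% that escapes a single character code.
writeEscapedCode( Stream, C ) :-
    (   C =:= 0'& -> write( Stream, '&amp;' )
    ;   C =:= 0'< -> write( Stream, '&lt;' )
    ;   C =:= 0'> -> write( Stream, '&gt;' )
    ;   put_code( Stream, C )
    ).

% writeEscapedAttributeValue( +Stream, +Codes ): like writeEscapedString/2,
% but also escapes the double quote (code 34), because the sample code
% delimits attribute values with double quotes.
writeEscapedAttributeValue( _, [] ).
writeEscapedAttributeValue( Stream, [C|Cs] ) :-
    (   C =:= 34 -> write( Stream, '&quot;' )
    ;   writeEscapedCode( Stream, C )
    ),
    writeEscapedAttributeValue( Stream, Cs ).
```

This sketch does not emit character references for tabs, line feeds or carriage returns in attribute values, which a canonicalizing writer would need; it only guarantees that the output remains well-formed markup.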
|
5.
Limitations
- The type of attributes declared in the DTD is not
recorded in the XML DOM representation.
- Internal and parameter entities are not kept in the
representation.
- The properties which involve referencing other information items are
not implemented ( [parent], [references],
[notation] and [owner element]).
This is motivated by the fact that some Prolog systems do not
support cyclic terms.
|
6.
Copyright
(c) Carlos Viegas Damásio
(cd@di.fct.unl.pt)
CENTRIA - Centro de Inteligência Artificial da Universidade Nova de Lisboa
This software is distributed under the GNU Library General Public License. |
|
Last update: November 9th, 2003 |