AEP-100: XML Configuration Parameter Whitespace and Literalization Handling

AEP: 100
Title: XML Configuration Parameter Whitespace and Literalization Handling
Version: 14
Last-Modified: 2010-01-06T16:22:00Z
Author: david
Status: Final
Type: Standards Track
Created: 2010-01-04
Agavi-Version: 1.1
Post-History:

Requirements

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [IETF RFC 2119].

Abstract

This proposal specifies changes to the handling of non-significant whitespace and to the literalization of XML configuration parameter values. At the moment, leading and trailing whitespace in parameter values is discarded, and values are always literalized. The functionality outlined in this proposal gives full control over both of these aspects.

Motivation

So far, it is not possible to retain leading or trailing whitespace in a parameter value, because the underlying implementation uses AgaviToolkit::literalize(), which strips it. A remotely related bug converts an empty string to NULL, while a value consisting only of whitespace characters is converted to an empty string; this should be fixed for Agavi 1.0.3, see #1203.

Likewise, it is not possible to prevent Agavi from converting a string like true or on to a boolean value, or from expanding configuration directives like %core.environment% to their respective values (this is also done by AgaviToolkit::literalize()).

Specification

The attribute {http://www.w3.org/XML/1998/namespace}space (specified in [XML 1.0] and henceforth called xml:space) will be allowed as an attribute on {http://agavi.org/agavi/config/global/envelope/1.1}parameter (see XML Namespace Versions Change) elements, with allowed values of "default" and "preserve".

The attribute {http://agavi.org/agavi/config/global/envelope/1.1}literalize (henceforth called ae:literalize) will be allowed as an attribute on {http://agavi.org/agavi/config/global/envelope/1.1}parameter (see XML Namespace Versions Change) elements, with allowed values of "on"/"yes"/"true" (evaluates to boolean true) and "off"/"no"/"false" (evaluates to boolean false).
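
As an illustration of how a configuration reader might pick up these two attributes, here is a minimal Python sketch; it is not Agavi's PHP implementation. The helper name read_parameter_flags, the exact (case-sensitive) matching of the ae:literalize values, and the ValueError raised for any other value are assumptions made for this example. The namespace URIs are the ones named above.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
ENVELOPE_NS = "http://agavi.org/agavi/config/global/envelope/1.1"

TRUE_VALUES = {"on", "yes", "true"}
FALSE_VALUES = {"off", "no", "false"}


def read_parameter_flags(element):
    """Return (space, literalize) for a parameter element (hypothetical helper)."""
    # ElementTree exposes namespaced attributes in the same {uri}local form
    # that this document uses.
    space = element.get("{%s}space" % XML_NS, "default")
    if space not in ("default", "preserve"):
        raise ValueError("xml:space must be 'default' or 'preserve'")

    raw = element.get("{%s}literalize" % ENVELOPE_NS)
    if raw is None:
        # An absent ae:literalize attribute is handled like a true value.
        return space, True
    if raw in TRUE_VALUES:
        return space, True
    if raw in FALSE_VALUES:
        return space, False
    # Assumption: anything else is rejected rather than silently ignored.
    raise ValueError("unsupported ae:literalize value: %r" % raw)


element = ET.fromstring(
    '<ae:parameter xmlns:ae="%s" xml:space="preserve" ae:literalize="off">'
    " padded </ae:parameter>" % ENVELOPE_NS
)
flags = read_parameter_flags(element)
```

The xml prefix needs no declaration because it is bound to the XML namespace by definition; the ae prefix is declared inline here only to keep the fragment self-contained.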

If the value of the xml:space attribute is "default" or the xml:space attribute is not present, then the parameter value is handled as follows:

  • leading and trailing whitespace MUST be removed, creating a resulting value for the parameter
  • a resulting value of empty string MUST be converted to NULL
  • if the value of the ae:literalize attribute evaluates to true or the ae:literalize attribute is not present
    • the resulting value MAY be converted to a different literal (currently "on"/"yes"/"true" to boolean true and "off"/"no"/"false" to boolean false, all case-insensitive)
    • configuration directive strings (e.g. "%core.environment%") MAY be expanded; expansion does not require the directive to match the entire value and is therefore not affected by the presence or absence of leading or trailing whitespace

If the value of the xml:space attribute is "preserve", then the parameter value is handled as follows:

  • leading and trailing whitespace MUST NOT be removed
  • an empty string MUST NOT be converted to NULL
  • if the value of the ae:literalize attribute evaluates to true or the ae:literalize attribute is not present
    • a value MUST NOT be converted to a different literal (currently "on"/"yes"/"true" to boolean true and "off"/"no"/"false" to boolean false, all case-insensitive) unless the replacement source, including potential leading or trailing whitespace, matches the value exactly
    • configuration directive strings (e.g. "%core.environment%") MAY still be expanded; expansion does not require the directive to match the entire value and is therefore not affected by the presence or absence of leading or trailing whitespace

Note how the conversion of an empty string to NULL is not affected by the value of the ae:literalize attribute.
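
The two handling modes above can be sketched as a single function. This is a minimal Python illustration under stated assumptions, not Agavi's PHP implementation: the names handle_parameter and expand_directives, the directive pattern, the choice to leave unknown directives untouched, and the set of characters treated as whitespace (the four XML whitespace characters) are all hypothetical choices made for the example.

```python
import re

TRUE_WORDS = {"on", "yes", "true"}
FALSE_WORDS = {"off", "no", "false"}

# Assumption: a directive is a name made of word characters and dots between % marks.
DIRECTIVE = re.compile(r"%([A-Za-z0-9_.]+)%")

# Assumption: "whitespace" means the four XML whitespace characters.
XML_WHITESPACE = " \t\r\n"


def expand_directives(value, directives):
    """Replace each %name% with its value; unknown names are left untouched."""
    return DIRECTIVE.sub(
        lambda m: str(directives.get(m.group(1), m.group(0))), value
    )


def handle_parameter(raw, space="default", literalize=True, directives=None):
    """Apply the whitespace and literalization rules to one parameter value."""
    directives = directives or {}
    value = raw

    if space != "preserve":
        # Default mode: trim, then turn an empty result into NULL (None here),
        # regardless of the literalize flag.
        value = value.strip(XML_WHITESPACE)
        if value == "":
            return None

    if not literalize:
        return value

    # Boolean words only match the whole value, so preserved whitespace
    # such as "true " prevents the conversion.
    lowered = value.lower()
    if lowered in TRUE_WORDS:
        return True
    if lowered in FALSE_WORDS:
        return False

    # Directives match anywhere inside the value, whitespace or not.
    return expand_directives(value, directives)
```

For example, with the directive "foo" set to "bar", handle_parameter("%foo% ", space="preserve", directives={"foo": "bar"}) yields "bar " with the trailing space kept, while the same input in default mode yields "bar".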

Alternatives

The ae:literalize attribute may alternatively be put in no namespace, but having it in the envelope namespace enables future re-use for other configuration elements if desired.

Furthermore, an empty string could alternatively be left unconverted when the value of the ae:literalize attribute evaluates to boolean false, but I think empty strings should be converted to NULL regardless of the value of the ae:literalize attribute.

Examples of Behavior

For the purpose of this table, the configuration directive "foo" shall contain the value "bar".

Input (XML) | Result (PHP)
<ae:parameter></ae:parameter> | NULL
<ae:parameter xml:space="preserve"></ae:parameter> | string(0) ""
<ae:parameter> </ae:parameter> | NULL
<ae:parameter xml:space="preserve"> </ae:parameter> | string(1) " "
<ae:parameter>true</ae:parameter> | bool(true)
<ae:parameter ae:literalize="false">true</ae:parameter> | string(4) "true"
<ae:parameter xml:space="preserve">true</ae:parameter> | bool(true)
<ae:parameter xml:space="preserve" ae:literalize="false">true</ae:parameter> | string(4) "true"
<ae:parameter>true </ae:parameter> | bool(true)
<ae:parameter xml:space="preserve">true </ae:parameter> | string(5) "true "
<ae:parameter>%foo%</ae:parameter> | string(3) "bar"
<ae:parameter ae:literalize="false">%foo%</ae:parameter> | string(5) "%foo%"
<ae:parameter xml:space="preserve">%foo%</ae:parameter> | string(3) "bar"
<ae:parameter xml:space="preserve" ae:literalize="false">%foo%</ae:parameter> | string(5) "%foo%"
<ae:parameter>%foo% </ae:parameter> | string(3) "bar"
<ae:parameter ae:literalize="false">%foo% </ae:parameter> | string(5) "%foo%"
<ae:parameter xml:space="preserve">%foo% </ae:parameter> | string(4) "bar "
<ae:parameter xml:space="preserve" ae:literalize="false">%foo% </ae:parameter> | string(6) "%foo% "

XML Namespace Versions Change

These changes will incur a bump in the version number of the global/envelope XML namespace from http://agavi.org/agavi/config/global/envelope/1.0 to http://agavi.org/agavi/config/global/envelope/1.1.

As a consequence, all namespaces that reference this namespace must undergo a similar change to their version numbers. This means that all namespaces in Agavi that are used for configuration file contents will be changed.

Rationale

Control over whitespace is desirable under certain circumstances, e.g. when defining a prefix to be used for outputting something. At the same time, it's likely that people will want to define a simple string like "on" for a value. The best example for both of these situations would be AgaviSimpleTranslator, which uses a simple map defined in configuration parameters for translation.

Likewise, there will be situations where people don't want configuration directives to be expanded, at least not immediately. Simply specifying the name without enclosing it in % marks would remedy this, but that's not an option if the value is a string of text containing the directive placeholder.

Normative References

IETF RFC 2119
 RFC 2119: Key words for use in RFCs to Indicate Requirement Levels, Internet Engineering Task Force, 1997.
XML 1.0
 Extensible Markup Language (XML) 1.0 (Fifth Edition), World Wide Web Consortium, 2008.

This document has been placed in the public domain.
