2. The language used to write formulas

The language used to write formulas is case insensitive.

2.1. Statements and comment lines

A formula can contain several statements, blank lines and comment lines. Example:

x = 2

# This a comment line

y = 2
= x + y
  • Spaces are allowed inside statements.

  • Statements must end with a newline.

  • All statements except last one are used to assign to local variables the values of intermediate expressions.

  • Last statement is the result of the formula. It must start with '='.

  • Some identifiers are reserved: TRUE, FALSE and all references to table cells: A1, B2, DC273, etc. Do not use reserved identifiers for the names of your local variables. Example: x=2 is OK, x1=2 is not.

2.2. Expressions

Expressions use the following operators to combine primary expressions (function calls, literals, references, etc). Operators are listed from highest priority to lowest priority.

  • -number negation like in -1.

  • number% percentage like in 10%

  • number1^number2 exponentiation: 2^3=8

  • number1*number2 multiplication. number1/number2 division.

  • number1+number2 addition. number1-number2 subtraction.

  • string1&string2 string concatenation: "black"&"white"="blackwhite".

  • value1=value2 (equal). value1<>value2 (different). number1<number2 (less than). number1<=number2 (less than or equal). number1>number2 (greater than). number1>=number2 (greater than or equal).

Unlike in other spreadsheet software, there are no union and intersection operators. Use the union() and intersection() functions. (The difference() function is also very useful.)

Unlike in other languages, there are no test, logical and, logical or and logical negation operators. Use the following functions for that: if(), and(), or(), not().[3]

Parentheses may be used to group subexpressions. Example:

= 2+2*3

means:

= 2 + (2*3)

because the priority of '*' is higher than the priority of '+'. If, in fact, you did not intended to write this, you must use parentheses:

= (2 + 2)*3

2.3. Function calls

The syntax of a function call is: function_name(argument, argument2, ...., argumentN) whatever is the locale of the computer running XMLmind XML Editor. Examples:

PI()
sin(x)
log(x, 10)
max(A1, A2, 100.0)

2.4. Literals

2.4.1. Numbers

Numbers are always written the same whatever is the locale of the computer running XMLmind XML Editor: 1, 2.3, 314E-2, 0.314e1, 0.314e+1, etc.

2.4.2. Strings

String must quoted using double quotes (") or single quotes ('). They cannot contain newline characters.

In a string quoted using double quotes, the double quote character must be escaped by doubling it. In a string quoted using single quotes, the single quote character must be escaped by doubling it.

Examples:

"It doesn't matter"
'It doesn''t matter'
"Did you say ""bizarre""?"
'Did you say "bizarre"?'

2.4.3. Booleans

TRUE and TRUE() (case insensitive of course) specify logical value true. FALSE and FALSE() specify logical value false.

2.5. References

2.5.1. Reference to a variable

There is nothing special to do to reference a variable, except that a variable needs to have been assigned a value before being referenced. Example:

x = 2
x = x + 1
= 2*x*x

2.5.2. Reference to an element having a specific ID

The syntax of this type of reference is: $(ID), where ID is the ID of a element contained in the same document as the formula. Example taken from the tutorial:

= left($(vat), len($(vat)) - 1)

This type of reference returns an XML nodeset containing a single element. If there is no element having specified ID, the reference returns the empty nodeset. More on this below.

2.5.3. Reference to table cells

The following references are said to be relative:

  • First cell of a table: A1.

  • Second row: 2:2.

  • Second column: B:B.

  • First two rows: 1:2.

  • First two columns A:B.

  • Four cells in the top/left corner of the table: A1:B2.

The following references are said to be absolute:

  • First cell of a table: $A$1.

  • Second row: $2:$2.

  • Second column: $B:$B.

  • First two rows: $1:$2.

  • First two columns $A:$B.

  • Four cells in the top/left corner of the table: $A$1:$B$2.

Mixed references are possible too: A$1, $A1, 2:$2, $2:2, etc.

What does this mean? Example: cell $A$3 (first column, third row) contains the following formula:

= A1 + A2

Because the formula uses relative references, XXE translates this to:

= $[-2,0] + $[-1,0]

which means add the cell which is 2 rows above me to the cell which is 1 row above me.

If you copy the formula of $A$3 to $B$3 (second column, third row), the copied formula will add $B$1 to $B$2 because the cell which is 2 rows above $B$3 is $B$1 and the cell which is 1 row above $B$3 is $B$2.

On the other hand, if cell $A$3 contained the following formula:

= $A$1 + $A$2

Because the formula uses absolute references, XXE would have translated this to:

= $[1,1] + $[1,2]

which means add the cell which is at first column, first row to the cell which is at first column, second row.

With absolute references, after copying the formula of $A$3 to $B$3, the copied formula would still add $A$1 to $A$2.

Important

  • Relative references are allowed only when the formula is inside a table cell.

  • References to cells, whether relative or absolute, are really possible when the XML document is displayed using a styled view.

    A reference such as $A$3 cannot be evaluated unless the formula has an ancestor element formatted as a table.

    A reference such as "Income"!$C$4 cannot be evaluated unless element with ID Income is formatted as a table.

2.5.3.1. The $[row,column] notation

This notation is the one which is internally used by XXE. You can type $[row,column] references if you want, but XXE will never show you these references as you typed them. XXE (the status bar, the formula editor, etc) will automatically display cell references using the customary A1 notation, which is much more readable.

Absolute reference examples: $A$1 = $[1,1] $2:$2 = $[2,] $B:$B = $[,2] $1:$2 = $[1,]:$[2,] $A:$B = $[,1]:$[,2] A1:B2 = $[1,1]:$[2,2]

Relative reference examples, the formula being inside cell $A$3: A1 = $[-2,0] 2:2 = $[-1,] B:B = $[,+1] 1:2 = $[-2,]:$[-1,] A:B = $[,0]:$[,+1] A1:B2 = $[-2,0]:$[-1,+1]

2.5.3.2. Reference to cells in another table of the document

The element formatted using CSS property "display:table;" must have an ID. The syntax is: "ID"!cell_reference. Examples: "Income"!$C$4, "table-23"!A:A (relative references are allowed too if the formula has itself an ancestor element formatted as a table).

2.6. XPath escapes

The language used to write formulas is very close to the one used by other spreadsheet software. The rationale is of course to make it easy to learn. But in fact, formulas are internally translated to another, less known language: XPath 1.0. XPath can be considered as the standard, native, expression language of XML.

This design allows to freely mix XPath expressions with ``ordinary'' expressions. This is what we call XPath escapes.

The syntax of XPath escapes is simply: `XPath_expression`. (The character used here is the backquote '`'.)

Inside XPath_expression, the backquote character '`' must be escaped by doubling it. Example: `concat("XXE ", " is ````challenging to learn''.")`.

An XPath expression can reference the local variables of the formula using the usual XPath syntax for that: $variable_name.

Using XPath escapes is often mandatory. Example 1: add 2 to the value of attribute count of the element containing the formula:

= `@count` + 1

Example 2 taken from the tutorial: get element with ID B2&"_desc" found in external document "products.html" (pathname relative to the file containing the document being edited).

location = (("products.html#" & trim(B2)) & "_desc")
=`document($location, .)`

This XPath-based design also allows to use XPath functions as if they were regular spreadsheet functions. In order to do this, simply add an underscore '_' at the beginning of the XPath function name and, if this name contains dashes '-', replace them by underscores. Examples:

  • _contains("Large", "e") = TRUE

  • _substring_before("Large", "e") = "Larg"

2.7. Value types

Most operators expect their operands to have a specific type and return a result having a specific type. Examples: + expects 2 numbers and returns a number, & expect two strings and returns a string.

Most functions expect their arguments to have a specific type and return a result having a specific type. Examples: left() expects a string and a number and returns a string, cos() expects a number and returns a number.

When the operand or the argument does not have the expected type, an automatic conversion is performed if this is possible; otherwise an error is reported.

Example that works: cos(B2) where cell B2 contains a string "3.14" which can be parsed as a number.

Example that sort of works: 2 + "two" (gives ``number'' NaN, which is not really a number).

Example that reports an error (because function sum() has been specified to do that): sum(2, "two").

This automatic conversion process is detailed in next section.

The types of values returned by expressions are:

Number

Examples: .23, 3.14, 314E-2, PI(), 2+2, log(A2,10).

String

Examples: "Great", 'Not that great', "black" & "white", mid(D3, 3, 4).

Boolean

Examples: TRUE, FALSE, true(), false(), 1>=0, and(A1 >= 1, A1 <= 99).

Nodeset

A set of XML nodes: Examples: A1, A:A, $2:$3, "table-23"!$A$1, $(vat), `id("product-list")/*`, apply("x + 1", B:B).

Date

Year/month/day. Examples: today(), date(1960,3,16), datevalue("1960-03-16"), datevalue("3/16/60", "MM/dd/yy").

Time

Hour/minute/seconds/fraction of a second. Examples: time(13,30,45), timevalue("13:30:45"), timevalue("1:30:45 PM", "hh:mm:ss a").

DateTime

Year/month/day and hour/minute/seconds/fraction of a second: Examples: now(), datetime(1960,3,16,13,30,45), datetimevalue("1960-03-16T13:30:45"), datetimevalue("1:30:45.250 PM 3/16/60", "hh:mm:ss.SSS a MM/dd/yy").

2.8. Automatic conversion between different value types

The behavior of operators such as +, *, =, <>, <=, >, etc, is the one of equivalent XPath operators.

Surprising example: using = and <> to compare node-sets:

1two
two3

(A:A = B:B), (A:A = B1) and (A:A <> B2) all work and evaluate to true.

Fortunately the behavior of spreadsheet function such as sum(), rounddown(), etc, is almost identical to the behavior of similar functions found in other spreadsheet software. This behavior has no XPath equivalent.

The automatic conversion process is the one described in the XPath standard.

From\ToNumberStringBooleanNodesetDate / Time / DateTime
Number-

Conversion does not add superfluous zeros after the point (e.g. it generates 1, not 1.0).

Scientific notation (e.g. 314E-2) is never used.

0 and NaN are converted to FALSE. Other numbers are converted to TRUE.

ERROR

A number is taken to be the number of seconds since January 1, 1970, 00:00:00 GMT. This number is converted to the corresponding datetime.

String

Strings such as 3.14, -2 can be parsed as numbers.

Strings such as 314E-2 or 1,000,000.0 cannot be parsed as numbers (use function numbervalue() to do this).

A string which cannot be parsed as a number is converted to NaN.

-

A string is TRUE if its length is non-zero.

ERROR

Strings using the ISO 8601 format (also used by W3C XML Schema Datatypes) are successfully converted to date, time and datetime.

Examples: 1960-03-15Z (date), 13:30:00Z (time), 1960-03-16T12:30:00Z (datetime)

Boolean

TRUE is converted to 1. FALSE is converted to 0.

TRUE is converted to "true". FALSE is converted to "false".

-ERRORERROR
Nodeset

A nodeset is first converted to a string and then this string is converted to a boolean.

String value of the node in the nodeset that is first in document order.

Text contained in descendant nodes of this first node is taken into account. Except that text contained in comments and processing-instructions is ignored.

Example: <ul><li>The <b>little</a></li><li> <!--pussy-->cat </li><li>is chasing a mouse.</li></ul> converted to a string gives "The little cat is chasing a mouse.".

A nodeset is TRUE if it is non empty.

-ERROR
Date / Time / DateTime

Date, time and datetime are converted to the number of seconds since January 1, 1970, 00:00:00 GMT.

(The ``date used for a time'' is January 1, 1970 GMT.)

Date, time and datetime are represented using the ISO 8601 format (also used by W3C XML Schema Datatypes).

Examples: 1960-03-15Z (date), 13:30:00Z (time), 1960-03-16T12:30:00Z (datetime)

Date, time and datetime are always TRUE.

ERROR-


[3] The documentation of functions is exclusively available online. Use the Paste Function tab of the Formula Editor to browse it.