
ManifoldCF Scripting Language

Overview

The ManifoldCF scripting language allows symbolic communication with the ManifoldCF API Service in order to define connections and jobs, and perform crawls. The language provides support for JSON-like hierarchical documents, as well as the ability to construct properly encoded REST URLs. It also has support for simple control flow and error handling.

How to use the script interpreter

The ManifoldCF script interpreter can be used in two ways: as a real-time shell that executes a script as it is typed, or as an interpreter for a script file. The main class of the interpreter is org.apache.manifoldcf.scriptengine.ScriptParser, and the two ways of invoking it are:

java -cp ... org.apache.manifoldcf.scriptengine.ScriptParser
      

or:

java -cp ... org.apache.manifoldcf.scriptengine.ScriptParser <script_file> <arg1> ... <argN>
      

If you choose to invoke ScriptParser in interactive mode, simply type your script one line at a time. Any error is reported immediately, and ScriptParser then exits. You can also type ^Z to terminate the script.

If you use ScriptParser with a script file, that file will be read and interpreted. The arguments you provide will be loaded into an array of strings, which is accessible from your script as the variable named __args__.
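
As a minimal sketch of how a script file might consume its arguments, the loop below prints each entry of __args__ in turn. The variable name i is chosen only for illustration, and the sketch relies on the __size__ attribute and the subscript operation that this document describes for arrays.

```
# Print every argument that was passed after the script file name.
set i = 0;
while i < __args__.__size__ do
  print __args__[i];
  set i = i + 1;
;
```

Saved under a hypothetical file name such as printargs.mcf, the script could be run as: java -cp ... org.apache.manifoldcf.scriptengine.ScriptParser printargs.mcf first second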

Running the script interpreter by hand

When you build ManifoldCF, the required dependent jars for the scripting language are copied to dist/script-engine/lib. On Windows, you can run the interpreter in interactive mode by typing:

cd dist\script-engine
run-script.bat <args>
        

Or, on Linux:

cd dist/script-engine
run-script.sh <args>
        

You will need to set the environment variable ENGINE_HOME to point at the dist/script-engine directory beforehand, so that the scripts can locate the appropriate jars.

Running the script interpreter using Ant

You can also start the script interpreter with all the correct required jars using Ant. Simply type the following:

ant run-script-interpreter
        

This will start the script interpreter in interactive mode only.

Running the script interpreter using Maven

You can also run the script interpreter using Maven. The commands are:

cd framework/script-engine
mvn exec:exec
        

This, once again, will start the interpreter in interactive mode.

Script language syntax

A ManifoldCF script is not sensitive to whitespace or indenting. All comments begin with a '#' character and end with the end of that line. Unquoted tokens can include alphanumeric characters, plus '_', '$', and '@'. Numeric tokens always begin with a digit ('0'-'9'), and are considered floating-point if they include a decimal point ('.'). Otherwise they are integers. String tokens can be quoted with either a double quote ('"') or a single quote ('''), and within strings characters can be escaped with a preceding backslash ('\').

The syntax of a ManifoldCF script is readily described with a BNF grammar, which is given below.
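
The lexical rules above can be illustrated with a short sketch. The variable names are hypothetical, and no output is shown because the exact print formatting is not specified here.

```
# This whole line is a comment.
set n = 42;                      # integer token: digits only
set x = 4.5;                     # floating-point token: contains a '.'
set s1 = "double-quoted";        # string token in double quotes
set s2 = 'single-quoted';        # string token in single quotes
set s3 = "a \"quoted\" word";    # backslash escapes the inner double quotes
print s3;
```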

program:
--> statements
  
statements:
--> statement1 ... statementN

statement:
--> 'set' expression '=' expression ';'
--> 'print' expression ';'
--> 'if' expression 'then' statements ['else' statements] ';'
--> 'while' expression 'do' statements ';'
--> 'break' ';'
--> 'error' expression ';'
--> 'insert' expression 'into' expression ['at' expression] ';'
--> 'remove' expression 'from' expression ';'
--> 'wait' expression ';'
--> 'GET' expression '=' expression ';'
--> 'PUT' expression '=' expression 'to' expression ';'
--> 'POST' expression '=' expression 'to' expression ';'
--> 'DELETE' expression '=' expression ';'

expression:
--> '(' expression ')'
--> expression '&&' expression
--> expression '||' expression
--> '!' expression
--> expression '&' expression
--> expression '|' expression
--> expression '==' expression
--> expression '!=' expression
--> expression '>=' expression
--> expression '<=' expression
--> expression '>' expression
--> expression '<' expression
--> expression '+' expression
--> expression '-' expression
--> expression '*' expression
--> expression '/' expression
--> '-' expression
--> '[' [expression [',' expression ...]] ']'
--> '{' [expression [',' expression ...]] '}'
--> '<<' expression ':' expression ':' [expression '=' expression [',' expression '=' expression ...]] ':' [expression [',' expression ...]] '>>'
--> expression '[' expression ']'
--> expression '.' token
--> token
--> string
--> number
--> 'true' | 'false'
--> 'null'
--> 'new' newexpression
--> 'isnull' expression

newexpression:
--> 'url' expression
--> 'connectionname' expression
--> 'configuration'
--> 'configurationnode' expression
--> 'array'
--> 'dictionary'
--> 'queryarg' expression ['=' expression] 

      

Script language variables

Variables in the ManifoldCF scripting language determine the behavior of all aspects of expression evaluation, with the exception of operator precedence. In particular, every canonical variable has the ability to support arbitrary attributes (which are named properties of the variable), subscripts (children which are accessed by a numeric subscript), and all other operations, such as '+' or '=='. Not all kinds of variable instance will in fact support all such features. Should you try to use a feature with a variable that does not support it, you will receive a ScriptException telling you what you did wrong.

Since the actual operation details are bound to the variable, for binary operations the left-hand variable typically determines what actually takes place. For example:

print 3+7;
     [java] 10
print "3"+7;
     [java] 37
      

There is, of course, a way to cast a variable to a different type. For example:

print "3".__int__+7;
     [java] 10
      

Here, we are using the built-in attribute __int__ to obtain the integer equivalent of the original string variable "3". See the following table for a list of some of the standard attributes and their meanings:

Standard attributes
Attribute name  Meaning
__script__      Returns the script code that would create this variable
__string__      Returns the string value of the variable, if any
__int__         Returns the integer value of the variable, if any
__float__       Returns the floating-point value of the variable, if any
__boolean__     Returns the boolean value of the variable, if any
__size__        Typically returns the number of subscript children
__type__        Returns the 'type' of the variable
__value__       Returns the 'value' of the variable
__dict__        Returns a dictionary equivalent of the variable
__OK__          Returns a boolean 'true' if the variable was "OK", false otherwise
__NOTFOUND__    Returns a boolean 'true' if the variable was "NOTFOUND", false otherwise
__CREATED__     Returns a boolean 'true' if the variable was "CREATED", false otherwise

Obviously, only some variables will support each of the standard attributes. You will receive a script exception if you try to obtain a non-existent attribute for a variable.
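
A brief sketch of attribute access follows. It uses only attributes that this document lists for strings, integers, and arrays; the variable names are illustrative, and output is omitted because its exact formatting is not specified here.

```
set n = 7;
set s = "3";
print s.__int__ + n;       # integer on the left, so '+' is integer addition
print n.__string__ + s;    # string on the left, so '+' is concatenation
set a = [3, 4];
print a.__size__;          # number of subscript children in the array
print a.__script__;        # script code that would create the array
```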

Integers

Integer variable types are created by non-quoted numeric values that do not have a '.' in them. For example, the token '4' will create an integer variable type with a value of 4.

The operations supported for this variable type, and their meanings, are listed in the table below:

Integer operations
Operation  Meaning                                           Example
binary +   Addition, yielding an integer                     4+7
binary -   Subtraction, yielding an integer                  7-4
binary *   Multiplication, yielding an integer               7*4
binary /   Division, yielding an integer                     7/4
unary -    Negation, yielding an integer                     -4
binary ==  Equality comparison, yielding a boolean           7 == 4
binary !=  Inequality comparison, yielding a boolean         7 != 4
binary >=  Greater or equals comparison, yielding a boolean  7 >= 4
binary <=  Less or equals comparison, yielding a boolean     7 <= 4
binary >   Greater comparison, yielding a boolean            7 > 4
binary <   Less comparison, yielding a boolean               7 < 4
binary &   Bitwise AND, yielding an integer                  7 & 4
binary |   Bitwise OR, yielding an integer                   7 | 4
unary !    Bitwise NOT, yielding an integer                  ! 7

In addition, the standard attributes __script__, __string__, __int__, and __float__ are supported by integer types.

Strings

String variable types are created by quoted sequences of characters. For example, the token '"hello world"' will create a string variable type with an (unquoted) value of "hello world".

The operations supported for this variable type, and their meanings, are listed in the table below:

String operations
Operation  Meaning                                    Example
binary +   Concatenation, yielding a string           "hi" + "there"
binary ==  Equality comparison, yielding a boolean    "hi" == "there"
binary !=  Inequality comparison, yielding a boolean  "hi" != "there"

In addition, the standard attributes __script__, __string__, __int__, and __float__ are supported by string types.

Floating-point numbers

Float variable types are created by non-quoted numeric values that have a '.' in them. For example, the token '4.1' will create a float variable type with a value of 4.1.

The operations supported for this variable type, and their meanings, are listed in the table below:

Float operations
Operation  Meaning                                           Example
binary +   Addition, yielding a float                        4.0+7.0
binary -   Subtraction, yielding a float                     7.0-4.0
binary *   Multiplication, yielding a float                  7.0*4.0
binary /   Division, yielding a float                        7.0/4.0
unary -    Negation, yielding a float                        -4.0
binary ==  Equality comparison, yielding a boolean           7.0 == 4.0
binary !=  Inequality comparison, yielding a boolean         7.0 != 4.0
binary >=  Greater or equals comparison, yielding a boolean  7.0 >= 4.0
binary <=  Less or equals comparison, yielding a boolean     7.0 <= 4.0
binary >   Greater comparison, yielding a boolean            7.0 > 4.0
binary <   Less comparison, yielding a boolean               7.0 < 4.0

In addition, the standard attributes __script__, __string__, __int__, and __float__ are supported by float types.

Booleans

Boolean variable types are created by the keywords true and false. For example, the code 'true' will create a boolean variable type with a value of "true".

The operations supported for this variable type, and their meanings, are listed in the table below:

Boolean operations
Operation  Meaning                                    Example
binary ==  Equality comparison, yielding a boolean    true == false
binary !=  Inequality comparison, yielding a boolean  true != false
binary &&  AND logical operation, yielding a boolean  true && false
binary ||  OR logical operation, yielding a boolean   true || false
binary &   AND logical operation, yielding a boolean  true & false
binary |   OR logical operation, yielding a boolean   true | false
unary !    NOT logical operation, yielding a boolean  ! true

In addition, the standard attributes __script__ and __boolean__ are supported by boolean types.

Arrays

Array variable types are created by an initializer of the form [ [expression [, expression ...]] ]. For example, the script code '[3, 4]' will create an array variable type with two values, the integer "3" and the integer "4".

The operations supported for this variable type, and their meanings, are listed in the table below:

Array operations
Operation     Meaning                                                       Example
subscript []  Find the specified subscript variable, yielding the variable  [3,4] [0]

In addition, the standard attributes __script__ and __size__ are supported by array types, as well as the insert and remove statements.
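
The following sketch shows the insert and remove statements applied to an array, using the forms given in the grammar. It assumes that the expression after 'remove' is the numeric position of the element to remove; the variable name myarray is illustrative.

```
set myarray = [3, 4];
insert 5 into myarray;         # no 'at' clause: add at the end
insert 2 into myarray at 0;    # add at position 0
remove 0 from myarray;         # remove the element at position 0
print myarray.__size__;
print myarray[0];
```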

Dictionaries

Dictionary variable types are created using the "new" operator, e.g. new dictionary.

The operations supported for this variable type, and their meanings, are listed in the table below:

Dictionary operations
Operation     Meaning                                              Example
subscript []  Find the specified key, yielding the keyed variable  mydict ["keyname"]

In addition, the standard attributes __script__ and __size__ are supported by dictionary types.
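
A minimal sketch of dictionary use follows. It assumes that a subscripted dictionary element can be the target of a set statement, which this document does not state explicitly; the key and value are illustrative.

```
set mydict = new dictionary;
set mydict["keyname"] = "some value";   # assumption: a keyed element can be assigned with 'set'
print mydict["keyname"];
print mydict.__size__;
```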

Configurations

Configuration variables contain the equivalent of the JSON used to communicate with the ManifoldCF API. They can be created using an initializer of the form { [expression [, expression ...]] }. For example, the script code '{ << "outputconnector" : "" : : << "description" : "Solr" : : >>, << "class_name" : "org.apache.manifoldcf.agents.output.solr.SolrConnector" : : >> >> }' would create a configuration variable equivalent to one that might be returned from the ManifoldCF API if it were queried for the output connectors registered by the system.

The operations supported for this variable type, and their meanings are listed in the table below:

Configuration operations
Operation     Meaning                                                                      Example
subscript []  Find the specified child configuration node variable, yielding the variable  myconfig [0]
binary +      Append a configuration child node variable to the list                       myconfig + << "something" : "somethingvalue" : : >>

In addition, the standard attributes __script__, __dict__, and __size__ are supported by configuration variable types, as well as the insert and remove statements.

Configuration nodes

Configuration node variable types are children of configuration variable types or configuration node variable types. They have several components, as listed below:

  • A type
  • A value
  • Attributes, described as a set of name/value pairs
  • Children, which must be configuration node variable types

Configuration node variable types can be created using an initializer of the form << expression : expression : [expression = expression [, expression = expression ...]] : [expression [, expression ... ]] >>. The first expression represents the type of the node. The second is the node's value. The series of '=' expressions represents attribute names and values. The last series represents the children of the node. For example, the script code '<< "description" : "Solr" : : >>' represents a node of type 'description' with a value of 'Solr', with no attributes or children.

The operations supported for this variable type, and their meanings are listed in the table below:

Configuration node operations
Operation     Meaning                                                                      Example
subscript []  Find the specified child configuration node variable, yielding the variable  myconfig [0]
binary +      Append a configuration child node variable to the list                       myconfig + << "something" : "somethingvalue" : : >>

In addition, the standard attributes __script__, __string__, __size__, __type__, __dict__ and __value__ are supported by configuration node variable types, as well as the insert and remove statements.
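
The sketch below builds a configuration from configuration nodes and reads parts of it back. The node types and values reuse the examples from this document; the attribute pair "note" = "example" and the variable names are hypothetical.

```
# A node with a type, a value, one attribute, and no children.
set node = << "description" : "Solr" : "note" = "example" : >>;
print node.__type__;
print node.__value__;

# Wrap the node in a configuration, then append a second node.
set myconfig = { node };
set myconfig = myconfig + << "class_name" : "org.apache.manifoldcf.agents.output.solr.SolrConnector" : : >>;
print myconfig.__size__;
print myconfig[1].__value__;
print myconfig.__script__;
```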

URLs

URL variable types exist to take care of the details of URL encoding while assembling the REST URLs needed to describe objects in ManifoldCF's REST API. A URL variable type can be created using a 'new' operation of the form new url expression, where the expression is the already-encoded root path. For example, the script code 'new url "http://localhost:8345/mcf-api-service/json"' would create a URL variable type with the root path "http://localhost:8345/mcf-api-service/json".

The operations supported for this variable type, and their meanings are listed in the table below:

URL operations
Operation  Meaning                                                                   Example
binary ==  Equals comparison, yielding a boolean                                     url1 == url2
binary !=  Non-equals comparison, yielding a boolean                                 url1 != url2
binary +   Append and encode another path or query argument element, yielding a URL  url1 + "repositoryconnections"

In addition, the standard attributes __script__ and __string__ are supported by URL variable types.
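
As a small sketch, a URL can be assembled from the root path one element at a time. The variable names are illustrative; the root path and the "repositoryconnections" element are the ones used in this document's examples.

```
set baseurl = new url "http://localhost:8345/mcf-api-service/json";
set connectionsurl = baseurl + "repositoryconnections";   # appended element is encoded by the URL variable
print connectionsurl.__string__;
print connectionsurl.__script__;
```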

Query Arguments

Query Argument variable types exist to take care of the details of URL encoding while assembling the query arguments of a REST URL for ManifoldCF's REST API. A Query Argument variable type can be created using a 'new' operation of the form new queryarg expression [= expression], where the first expression is the query argument name, and the second optional expression is the query argument value. For example, the script code 'new queryarg "report" = "simple"' would create a Query Argument variable type representing the query argument "report=simple". To add query arguments to a URL, simply add them using the '+' operator, for example "set urlvar = urlvar + new queryarg 'report' = 'simple';".

The operations supported for this variable type, and their meanings are listed in the table below:

Query Argument operations
Operation  Meaning                                    Example
binary ==  Equals comparison, yielding a boolean      arg1 == arg2
binary !=  Non-equals comparison, yielding a boolean  arg1 != arg2

In addition, the standard attributes __script__ and __string__ are supported by Query Argument variable types.
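
The following sketch creates a query argument in its own statement and then appends it to a URL. The argument "report" = "simple" is the example used above; whether the API accepts that argument for a particular resource is not asserted here, and the variable names are illustrative.

```
set urlvar = new url "http://localhost:8345/mcf-api-service/json";
set urlvar = urlvar + "repositoryconnections";
set arg = new queryarg "report" = "simple";
print arg.__string__;
set urlvar = urlvar + arg;     # the URL variable encodes the query argument
print urlvar.__string__;
```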

Connection names

Connection name variable types exist to perform the extra URL encoding needed for ManifoldCF's REST API. Connection names must be specially encoded so that they do not contain slash characters ('/'). Connection name variable types take care of this encoding.

You can create a connection name variable type using the following syntax: new connectionname expression, where the expression is the name of the connection.

The operations supported for this variable type, and their meanings are listed in the table below:

Connection name operations
Operation  Meaning                                    Example
binary ==  Equals comparison, yielding a boolean      cn1 == cn2
binary !=  Non-equals comparison, yielding a boolean  cn1 != cn2

In addition, the standard attributes __script__ and __string__ are supported by connection name variable types.
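
A sketch of a connection name used as part of a URL follows. The connection name "My Files/Archive" is hypothetical and was chosen because it contains a slash. The sketch assumes that a connection name variable can be appended to a URL variable with '+' in the same way as a path element, which this document implies but does not state.

```
set baseurl = new url "http://localhost:8345/mcf-api-service/json";
set cn = new connectionname "My Files/Archive";
print cn.__string__;
# Assumption: '+' accepts a connection name variable as the next path element.
set connurl = baseurl + "repositoryconnections" + cn;
print connurl.__string__;
```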

Results

Result variable types capture the result of a GET, PUT, POST, or DELETE statement. They consist of two parts:

  • A result code
  • A result configuration value

There is no way to directly create a result variable type, nor does it support any operations. However, the standard attributes __script__, __string__, __value__, __OK__, __NOTFOUND__, and __CREATED__ are all supported by result variable types.
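
A minimal sketch of inspecting a result variable after a GET follows. The URL is the one used in this document's examples; the variable names and messages are illustrative.

```
set baseurl = new url "http://localhost:8345/mcf-api-service/json";
GET result = baseurl + "repositoryconnections";
if result.__OK__ then
  # The result's value is a configuration; print it as script code.
  print result.__value__.__script__;
else
  if result.__NOTFOUND__ then
    print "not found";
  else
    error "Unexpected result: " + result.__string__;
  ;
;
```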

Statements

The statements available to a ManifoldCF script programmer are designed to support interaction with the ManifoldCF API. Thus, there is support for all four HTTP verbs, as well as basic variable setting and control flow. The table below describes each statement type:

Statement types

Statement: 'set' expression '=' expression ';'
Meaning:   Sets the variable described by the first expression with the value computed for the second
Example:   set myvar = 4 + 5;

Statement: 'print' expression ';'
Meaning:   Prints the string value of the expression to stdout
Example:   print "hello world";

Statement: 'if' expression 'then' statements ['else' statements] ';'
Meaning:   If the boolean value of the expression is 'true', executes the first set of statements, otherwise executes the (optional) second set
Example:   if true then print "hello"; else print "there"; ;

Statement: 'while' expression 'do' statements ';'
Meaning:   While expression is true, execute the specified statements, and repeat
Example:   while count > 0 do set count = count - 1; ;

Statement: 'break' ';'
Meaning:   Exits from the nearest enclosing while loop
Example:   while true do break; ;

Statement: 'error' expression ';'
Meaning:   Aborts the script with a script exception based on the string value of the expression
Example:   error "bad stuff";

Statement: 'wait' expression ';'
Meaning:   Waits the number of milliseconds corresponding to the integer value of the expression
Example:   wait 1000;

Statement: 'insert' expression 'into' expression ['at' expression] ';'
Meaning:   Inserts the first expression into the second variable expression, either at the end or optionally at the position specified by the third expression
Example:   insert 4 into myarray at 0 ;

Statement: 'remove' expression 'from' expression ';'
Meaning:   Removes the element described by the first expression from the second expression
Example:   remove 0 from myarray ;

Statement: 'GET' expression '=' expression ';'
Meaning:   Perform an HTTP GET from the URL specified in the second expression, capturing the result in the first expression
Example:   GET result = new url "http://localhost:8345/mcf-api-service/json/repositoryconnections" ;

Statement: 'DELETE' expression '=' expression ';'
Meaning:   Perform an HTTP DELETE on the URL specified in the second expression, capturing the result in the first expression
Example:   DELETE result = myurl ;

Statement: 'PUT' expression '=' expression 'to' expression ';'
Meaning:   Perform an HTTP PUT of the second expression to the URL specified in the third expression, capturing the result in the first expression
Example:   PUT result = configurationObject to myurl ;

Statement: 'POST' expression '=' expression 'to' expression ';'
Meaning:   Perform an HTTP POST of the second expression to the URL specified in the third expression, capturing the result in the first expression
Example:   POST result = configurationObject to myurl ;
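
To show several statements working together, here is a sketch that PUTs an object, waits, and then DELETEs it. The payload is a placeholder built from this document's own "something" node, not a real API object, and the path element "example" is hypothetical; a real request needs the nodes that the ManifoldCF API expects for the object being created.

```
set baseurl = new url "http://localhost:8345/mcf-api-service/json";
set myurl = baseurl + "repositoryconnections" + "example";   # "example" is a hypothetical object name

# Placeholder payload only; replace with the nodes the API expects.
set configurationObject = { << "something" : "somethingvalue" : : >> };

PUT result = configurationObject to myurl;
if result.__CREATED__ then
  print "created";
else
  print result.__string__;
;

wait 1000;                     # pause for 1000 milliseconds

DELETE result = myurl;
if result.__OK__ then
  print "deleted";
;
```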