Biferno provides a rich variety of string manipulation functionality. In
this chapter we analyze the most significant methods implemented by the
string and ansi classes. A detailed list of all methods and properties of
Biferno predefined classes is contained in "Biferno: Reference Guide".
Comparing two strings means comparing their characters one by one, starting from the first (leftmost) character, until either two different characters are found or the end of one of the two strings is reached.
Two strings are considered equal if they consist of the same characters in the same sequence. If two different characters are found in a corresponding position, the relationship between these two characters determines the relationship between the strings.
A character is considered "less" than another character if it precedes it in the ASCII character table. A string is considered "less" than another if its first differing character is "less" than the corresponding character in the other string. If the shorter string is exactly equal to the beginning of the longer string, the longer string is always considered greater. Notice that the result of comparing characters with codes greater than 127 (i.e. outside the ASCII range) can depend on the operating system, because these characters can have different values on different systems.
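The comparison rules above can be modeled in a few lines. The following Python sketch only illustrates the rules as stated and is not Biferno's implementation; the function name compare_strings is hypothetical, characters are compared by their numeric code, and the result is expressed from the point of view of the first argument.

```python
def compare_strings(a: str, b: str) -> int:
    """Return -1, 0 or 1 when a is less than, equal to or greater than b."""
    # Walk both strings from the leftmost character.
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            # The first differing pair decides the result.
            return -1 if ord(ch_a) < ord(ch_b) else 1
    # No difference found: equal, or one string is a prefix of the other,
    # in which case the longer string is the greater one.
    if len(a) == len(b):
        return 0
    return -1 if len(a) < len(b) else 1


print(compare_strings("sun", "Sun"))    # 1: "s" (115) follows "S" (83)
print(compare_strings("sun", "sunny"))  # -1: "sun" is the beginning of "sunny"
print(compare_strings("sun", "sun"))    # 0
```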
The simplest method to compare two Biferno strings is to use logical
operators. Using logical operators the comparison is case sensitive, i.e.
the two strings "sun" and "Sun" are considered different.
An alternative is to use the Compare method of the string class, with
prototype:
int Compare(string str, boolean caseSense = false)
This method takes as parameters the string to compare to and a Boolean
value to indicate if the comparison should be case sensitive or not (the
default is false). The method returns an integer value, which can be 0
(zero), if the two strings are equal, 1, if the string passed as parameter
is greater than the string the method is applied to, or -1 if it is
smaller. An example is:
<?
str1 = "a"
str2 = "b"
$str1.Compare(str2) // This instruction prints the value 1
str1 = "sun"
str2 = "Sun"
$str1.Compare(str2) // This instruction prints the value 0
$str1.Compare(str2, true) // This instruction prints the value -1
?>
A third way of comparing two strings is to use the strcmp method of the
ansi class:
static int strcmp(string str1, string str2)
This method is called statically and takes as parameters the two strings
to be compared. The result of the comparison is an integer value, which
can be 0 (zero), if the two strings are equal, a positive value if the
first string is greater than the second, or a negative value if the first
string is smaller than the second. If the result is non-zero, its value is
the difference between the ASCII codes of the first two characters that
differ between the strings.
In the following example the two strings str1 and str2 differ starting
from the second character. The strcmp method returns the value -14, which
indicates that str1 is smaller than str2, and is the difference between
the ASCII code of the a character (97) in str1 and the ASCII code of the
corresponding character (o, code 111) in str2.
<?
str1 = "salty"
str2 = "solitary"
$ansi.strcmp(str1, str2) // The instruction prints the value -14
?>
The comparisons executed by methods of the ansi class are always case
sensitive.
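The arithmetic behind the -14 result can be checked with a small model. This Python sketch is an illustration under stated assumptions, not the ansi class implementation: the name strcmp_model is hypothetical, and when one string is the beginning of the other the sketch returns only a sign (-1 or 1), since the text does not specify the magnitude in that case.

```python
def strcmp_model(str1: str, str2: str) -> int:
    """Difference between the codes of the first pair of differing characters."""
    for ch1, ch2 in zip(str1, str2):
        if ch1 != ch2:
            return ord(ch1) - ord(ch2)
    # Assumption: for a prefix relationship only the sign is meaningful.
    return (len(str1) > len(str2)) - (len(str1) < len(str2))


print(strcmp_model("salty", "solitary"))  # -14, i.e. 97 ("a") - 111 ("o")
print(strcmp_model("sun", "Sun"))         # 32, the comparison is case sensitive
```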
It is sometimes necessary to compare a string at the same time with several
different strings. This is the same as asking if the string is contained in
a given string set. To this end the string class provides the In method
with prototype:
boolean In(string str, char sep = ",")
The string set is passed as a single string in which the individual strings of the set are separated by a special character, called a separator. The default separator is the comma. An example is:
<?
if (user.GetUsername().In("john,paul,george,ringo"))
    print("<p>Welcome " + user.firstName + "</p>\n")
else
    print("<p>Unknown user. Access denied.</p>\n")
?>
The wildcard character * (star) can be used in the string set. In the
following example the In method returns true if the keyword string
starts with "comp", or is "software", or is "hardware".
<?
if (keyword.In("comp*,software,hardware"))
    category = 1
?>
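The set-membership test with wildcards can be sketched as follows. This Python model is only an approximation of the behavior described above: the name in_set is hypothetical, the match is assumed to be case sensitive, and the * character is assumed to stand for any run of characters wherever it appears in an element of the set.

```python
import re


def in_set(value: str, string_set: str, sep: str = ",") -> bool:
    """Return True if value matches one of the sep-separated patterns."""
    for pattern in string_set.split(sep):
        # Escape every literal piece, then let each "*" match any characters.
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        if re.fullmatch(regex, value, flags=re.DOTALL):
            return True
    return False


print(in_set("computer", "comp*,software,hardware"))  # True
print(in_set("network", "comp*,software,hardware"))   # False
```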
To establish if a string contains another string we use the Contains
method of the string class, with prototype:
boolean Contains(string str, boolean caseSense = false)
This method returns true if the str string is contained in the string that the method is
applied to, or false otherwise.
<?
str = client.userAgent
if (str.Contains("MSIE", true))
    print("Your browser is Internet Explorer")
?>
Variants of this method are:
ContainsWordBegin, which determines if the string that the method
is applied to contains a word beginning with the string passed as
parameter. A word is defined as a group of contiguous characters
delimited by spaces or other separators (punctuation marks, tabs,
etc.).
ContainsWordEnd, which determines if the string that the method is
applied to contains a word ending with the string passed as
parameter.
ContainsWordExact, which determines if the string that the method
is applied to contains a word exactly matching the string passed as
parameter.
The ansi class provides the strstr method with prototype:
static string strstr(string str1, string str2)
This method searches for str2 in str1 and, if str2 is contained in str1,
returns a string containing the substring of str1 that starts with str2
and reaches up to the last character of str1. In the following example the
string str2 is "MSIE 5.0; Mac_PowerPC)":
<?
str1 = "Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)"
str2 = ansi.strstr(str1, "MSIE")
?>
If we need to check if a string starts or ends with another string, we can
use the Begins and Ends methods of the string class, with prototypes:
boolean Begins(string str, boolean caseSense = false)
boolean Ends(string str, boolean caseSense = false)
It is sometimes useful to know the position of the first character of
a substring. We can use the Find method of the string class with
prototype:
int Find(string str, boolean caseSense = false, int from)
The Find method returns the position of the first instance of the str
string in the string the method is applied to, starting from the character
position specified by the from parameter (the default is to start from the
first character of the string); if the str string is not found the value 0
(zero) is returned.
<?
str = "Biferno is an object oriented language"
pos = str.Find("Biferno") // pos is 1
pos = str.Find("lang") // pos is 31
pos = str.Find("x") // pos is 0
?>
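The 1-based positions and the 0 result for a failed search can be reproduced with a short model. The following Python sketch is not the Biferno implementation; the name find and its parameter names are hypothetical, and case-insensitive matching is approximated by lowercasing both strings.

```python
def find(text: str, sub: str, case_sense: bool = False, start: int = 1) -> int:
    """Return the 1-based position of sub in text, or 0 if it is not found."""
    if not case_sense:
        text, sub = text.lower(), sub.lower()
    # str.find is 0-based and returns -1 on failure, so adding 1 yields
    # a 1-based position, or 0 when nothing is found.
    return text.find(sub, start - 1) + 1


text = "Biferno is an object oriented language"
print(find(text, "Biferno"))     # 1
print(find(text, "lang"))        # 31
print(find(text, "x"))           # 0
print(find(text, "e", start=5))  # 18, the "e" of "object"
```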
The IsEMail method of the string class returns true if the string has the
format of a valid email address.
<?
userEmail = "john@domain.com"
check_email = userEmail.IsEMail() // The method returns true
?>
The IsEMail method has the following prototype:
boolean IsEMail(boolean exists, string *msg)
If the exists parameter is true and the string is a legally formatted email
address, the IsEMail method contacts the mail exchange host for the
corresponding email domain and verifies if a user corresponding to the supplied
email address exists. The msg parameter will return an error message from
the SMTP server, if there is one.
Contacting an SMTP server over the Internet to verify an email address is a slow operation and may take up to several seconds. On the other hand, verification can be useful to avoid sending email to nonexistent addresses or to check that a user supplied an existing email address in a subscription form.
The IsDate method of the string class returns true if the string has the
format of a valid date. The method has prototype:
boolean IsDate(string format)
The format parameter specifies the expected order of day, month, year (the default is the one defined by the application variable DATE_FORMAT). A
dash (-), a forward slash (/), or the separator defined in DATE_FORMAT can
be used as separators:
<?
aDate = "24-10-2001"
check_date = aDate.IsDate("d-m-y") // The method returns true
check_date = aDate.IsDate("y-m-d") // The method returns false
?>
The IsNumeric method of the string class returns true if the string has
the format of a valid number. The method has prototype:
boolean IsNumeric(void)
The method recognizes the thousand separator (as defined by the
application variable THOUSAND_SEP or by the curScript.SetNumFormat method)
and the decimal separator (as defined by the application variable
DECIMAL_SEP or by the curScript.SetNumFormat method).
<?
/* Assume:
thousand separator = '.'
decimal separator = ','
*/
aNum = "30.000"
check_num = aNum.IsNumeric() // The method returns true
aNum = "300.00"
check_num = aNum.IsNumeric() // The method returns false
?>
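The reason "30.000" is accepted while "300.00" is rejected is that a thousand separator must be followed by exactly three digits. The Python sketch below models that rule with a regular expression; it is an assumption-laden illustration rather than the actual IsNumeric logic. The name is_numeric is hypothetical, and accepting an optional leading sign is a guess not confirmed by the text.

```python
import re


def is_numeric(text: str, thousand_sep: str = ".", decimal_sep: str = ",") -> bool:
    """Check digit grouping: groups of three after each thousand separator."""
    t, d = re.escape(thousand_sep), re.escape(decimal_sep)
    # Either grouped digits (1-3 digits, then separator + 3 digits, repeated)
    # or a plain run of digits, optionally followed by a decimal part.
    pattern = rf"[+-]?(\d{{1,3}}({t}\d{{3}})+|\d+)({d}\d+)?"
    return re.fullmatch(pattern, text) is not None


print(is_numeric("30.000"))    # True
print(is_numeric("300.00"))    # False
print(is_numeric("1.234,56"))  # True
```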
The methods of the string class described in this section transform a
string into another string, e.g. by eliminating some characters, extracting
substrings, or replacing a group of characters with another. Remember that
it is always possible to access the individual characters of a string (for
reading or writing) using the char property, which contains the string
representation as an array of characters.
The Hilite method searches for one or more strings in a text and inserts
a given string before and after each occurrence of the string(s). This
method has prototype:
string Hilite(boolean cs, boolean skipHTML, string pre, string post, obj strN...)
The cs parameter determines if the search should be case sensitive. The
skipHTML parameter can be used to exclude from the search text areas that
are HTML tags. The pre and post parameters provide the strings to be
inserted before and after each occurrence of the string. The strN
parameters provide one or more strings to highlight; each of them can
also be an array of strings or a variable of the search class (see
Chapter 15, Database Interaction). If an array is passed, all strings in
the array are highlighted. Simple strings and string arrays can be mixed.
<?
str = "Welcome to the Biferno user manual"
str = str.Hilite(false, true, "<b>", "</b>", "manual", "Biferno")
print(str)
?>
This example generates the following HTML code:
Welcome to the <b>Biferno</b> user <b>manual</b>
Which produces the following output:
Welcome to the Biferno user manual
The same example can also be written as:
<?
str = "Welcome to the Biferno user manual"
arrHilite = array("manual", "Biferno")
str = str.Hilite(false, true, "<b>", "</b>", arrHilite)
print(str)
?>
The pre and post strings can contain some special symbols that are replaced with
the string str to be highlighted:
The ** characters are replaced by str.
The $$ characters are replaced by str coded by the UrlEncode
method (see the following section on string encoding).
The ## characters are replaced by str with ISO Latin encoding (see
the following section on string encoding).
<?
str = "Welcome to the Biferno user manual"
str = str.Hilite(false, true, "<a href=\"http://www.tabasoft.it/**/\">",
    "</a>", "Biferno")
print(str)
?>
This example generates the following HTML code:
Welcome to the
<a href="http://www.tabasoft.it/biferno/">Biferno</a> user manual
Which produces the following output:
Welcome to the Biferno user manual
If a variable of the search class is passed to Hilite, all substrings contained in the
search are highlighted (see Chapter 15, Database Interaction).
The LowToUpper and UpToLower methods operate on the character case,
transforming lowercase characters into uppercase characters, and vice
versa. These methods have prototypes:
string LowToUpper(int from, int len)
string UpToLower(int from, int len)
The from parameter indicates the position of the first character in the
string that should be converted (the default value is 1). The len
parameter limits the number of characters to convert. The default is to
convert all characters starting from the position indicated by the from
parameter until the end of the string.
<?
str = "biferno"
str = str.LowToUpper()
print(str + "<br>\n")
str = str.UpToLower(3, 3)
print(str + "<br>\n")
str = str.UpToLower("len":2)
print(str + "<br>\n")
str = str.UpToLower(6)
print(str + "<br>\n")
?>
The example above produces the following result:
BIFERNO
BIferNO
biferNO
biferno
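The effect of the from and len parameters can be traced step by step with a small model. This Python sketch is only illustrative: the function convert_case and its parameter names are hypothetical, positions are 1-based as in the text, and an omitted length means "until the end of the string".

```python
from typing import Optional


def convert_case(text: str, upper: bool, start: int = 1,
                 length: Optional[int] = None) -> str:
    """Convert the case of length characters beginning at 1-based start."""
    begin = start - 1
    end = len(text) if length is None else begin + length
    middle = text[begin:end]
    middle = middle.upper() if upper else middle.lower()
    # Characters outside the selected range are left untouched.
    return text[:begin] + middle + text[end:]


s = convert_case("biferno", upper=True)              # "BIFERNO"
s = convert_case(s, upper=False, start=3, length=3)  # "BIferNO"
s = convert_case(s, upper=False, length=2)           # "biferNO"
s = convert_case(s, upper=False, start=6)            # "biferno"
print(s)
```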
The Capitalize method transforms into uppercase the first character of all
words contained in a string, where a word is a group of characters
delimited by spaces, line breaks, tabs, punctuation marks, quotes,
parentheses, etc.
<?
text = "The PEN is mightier than the sword"
text = text.Capitalize()
// text is "The Pen Is Mightier Than The Sword"
?>
Notice that the Capitalize method acts on all characters of a word, not just on the first
character: the remaining characters are converted to lowercase ("PEN" becomes "Pen").
The SubString method returns a substring of any length of the given
string, starting from a given character index. The method has prototype:
string SubString(int from, int len)
The from and len parameters have a meaning similar to the one described for
the LowToUpper and UpToLower methods, i.e. the method extracts len
characters starting from the position indicated by the from parameter. If
len is omitted, all characters until the end of the string are
extracted.
<?
str = "This is a text"
$str.SubString(1, 4) // prints "This"
$str.SubString(6) // prints "is a text"
$str.SubString(6, 2) // prints "is"
?>
The InsertSubString method inserts a string within another string starting from
a given character index. The method has prototype:
string InsertSubString(int pos, string subString)
E.g.:
<?
str = "What a day!"
str = str.InsertSubString(8, "nice ") // str is "What a nice day!"
?>
The RemoveSubString method removes a certain number of characters from a
string starting from a given position and has prototype:
string RemoveSubString(int from, int len)
This method has the same parameters as the SubString method and returns
the string obtained by removing len characters from the original string
starting from the position with index from, as in:
<?
str = "What a nice day!"
str = str.RemoveSubString(8, 5) // str is "What a day!"
?>
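The three substring operations differ only in how they use the same 1-based position. The Python sketch below models them with slices; it is an illustration of the indexing described above, not Biferno code, and the function names are hypothetical.

```python
def sub_string(text, start=1, length=None):
    """Extract length characters from 1-based start (to the end if omitted)."""
    begin = start - 1
    end = len(text) if length is None else begin + length
    return text[begin:end]


def insert_sub_string(text, pos, sub):
    """Insert sub so that its first character lands at 1-based position pos."""
    return text[:pos - 1] + sub + text[pos - 1:]


def remove_sub_string(text, start, length):
    """Remove length characters beginning at 1-based start."""
    begin = start - 1
    return text[:begin] + text[begin + length:]


print(sub_string("This is a text", 6, 2))              # is
print(insert_sub_string("What a day!", 8, "nice "))    # What a nice day!
print(remove_sub_string("What a nice day!", 8, 5))     # What a day!
```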
The Substitute method replaces a substring of the given string with
another and has prototype:
string Substitute(string oldString, string newString,
boolean cs = false, boolean skipHTML = false)
The cs and skipHTML parameters have the same meaning as in the Hilite
method. The oldString and newString parameters are, respectively, the
substring to search for and the string that replaces it.
<?
str = "john,paul,george,ringo"
str = str.Substitute(",","+") // str is "john+paul+george+ringo"
?>
The ToArray method converts a string into an array of strings. The array
elements are the substrings of the original string delimited by the
specified separator. The original string is left unmodified. This method
has prototype:
array ToArray(string separator=", ")
The default separator value is ", ", i.e. a comma followed by a space. In
the following example the ToArray method applied to the str string
generates an array of four elements containing the strings "john", "paul",
"george", "ringo" in this order.
<?
str = "john,paul,george,ringo"
myArray = str.ToArray(",")
$myArray[2] // prints the string "paul"
?>
The Pad method is used to add a certain number of repetitions of a given
character in front or at the end of a string, until a predefined length is
reached. This method has the following prototype:
string Pad(int totChars, char padChar, boolean before = false)
An example is:
<?
str = "123"
str = str.Pad(8, "0", true) // str is "00000123"
str = "123"
str = str.Pad(6, "*") // str is "123***"
?>
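Padding in front or at the end corresponds to right or left justification within a field of totChars characters. The Python sketch below models this; the name pad is hypothetical, and returning strings that are already long enough unchanged is an assumption, since the text does not cover that case.

```python
def pad(text: str, tot_chars: int, pad_char: str, before: bool = False) -> str:
    """Pad text with pad_char up to tot_chars, in front or at the end."""
    if before:
        return text.rjust(tot_chars, pad_char)  # fill on the left
    return text.ljust(tot_chars, pad_char)      # fill on the right


print(pad("123", 8, "0", True))  # 00000123
print(pad("123", 6, "*"))        # 123***
```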
The Encode method encodes a string according to the ISO 8859-1 format
(Latin-1). More precisely, Biferno uses the ANSI character set
(Windows-1252), which is an extension of the ISO 8859-1 set. E.g., the
à character is encoded as &#224; (ampersand + hash + decimal
numeric code of the character + semicolon). This encoding is necessary to
be able to output special characters on a Web page, such as vowels with
accents, and symbols that might otherwise not be displayed correctly by
the browser (this is a potential issue on the MacOS platform).
The Encode method has the following prototype:
string Encode(boolean alsoCR=false, boolean tagsVisible=false,
boolean entities=false, obj tagList)
The alsoCR parameter specifies if we want to encode new line characters
with <br> tags (default: no). The tagsVisible parameter indicates
if we want to encode the < and > characters that delimit HTML tags as
&lt; and &gt; (default: no). This is useful if one wants to
display a fragment of HTML code within a Web page avoiding
interpretation by the browser. The < and > characters are always
encoded when they are not part of an HTML tag. The tagList parameter
determines what constitutes a tag. When tagList is true, all words
introduced by the < character are valid tags (as in XML). When tagList
is false, only HTML tags (such as <b>, <body>, etc.) are
considered tags. If an associative array is passed as the value of the
tagList parameter, the names of the elements of the associative array
determine what is considered a tag. Notice that the actual values of the
elements of the associative array are ignored. Finally, the entities
parameter specifies if we want encoding to be performed using the
so-called HTML entities instead of numerical codes (e.g. the à
character is encoded as &agrave; and not as &#224;).
In the following example we demonstrate how the effect of the Encode method changes by changing the values of these parameters:
<?
str = "<p>Letters with accents: à, è, ì, ò, ù\nSpecial characters: \", &, ©, ®</p>"
$str.Encode() + "<br>\n"
$str.Encode(true) + "<br>\n"
$str.Encode(false, true)
?>
This example generates the following HTML code:
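<p>Letters with accents: &#224;, &#232;, &#236;, &#242;, &#249;
Special characters: ", &, &#169;, &#174;</p><br>
<p>Letters with accents: &#224;, &#232;, &#236;, &#242;, &#249;
<br>Special characters: ", &, &#169;, &#174;</p><br>
&lt;p&gt;Letters with accents: &#224;, &#232;, &#236;, &#242;, &#249;
Special characters: ", &, &#169;, &#174;&lt;/p&gt;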
which in turn generates the following output:
Letters with accents: à, è, ì, ò, ù Special characters: ", &, ©, ®
Letters with accents: à, è, ì, ò, ù
Special characters: ", &, ©, ®
<p>Letters with accents: à, è, ì, ò, ù Special characters: ", &, ©, ®</p>
The Decode method decodes a string encoded in ISO 8859-1 format
and has the following prototype:
string Decode(boolean alsoCR = false)
The alsoCR parameter indicates if we want to convert <br> tags into
new line characters (default: no).
A URL, acronym of Uniform Resource Locator, is a convention to describe a
resource available on the Internet. The UrlEncode method applies to a
string the encoding rules used in constructing a URL. All non-alphanumeric
special characters and spaces are replaced by the percent character (%)
followed by a two-digit hexadecimal code (e.g. the à character is
replaced by %E0). This encoding is necessary to avoid misinterpretation of
special characters in a URL during network transmission.
This method has the following prototype:
string UrlEncode(boolean spaceToPlus, string pre)
The spaceToPlus parameter specifies if we want to replace space characters
with %20 (standard encoding) or with the + character (default: no). The
pre parameter provides the string to use in front of hexadecimal character
codes (default: "%").
Let's see a couple of examples of use of the UrlEncode method.
<?
str = "This string contains the special characters: / $ & %"
$str.UrlEncode()
?>
This code fragment produces the following output:
This%20string%20contains%20the%20special%20characters%3A%20%2F%20%24%20%26%20%25
It is often useful to encode a string with UrlEncode in order to pass it as a parameter in a URL:
<?
city = "Mexico City"
url = "http://www.xyz-travels.com/search.bfr?dest=" + city.UrlEncode()
?>
In this example the city string is converted to "Mexico%20City".
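The encoding rule (every character that is not a letter or digit becomes the prefix followed by a two-digit hexadecimal code) can be modeled as follows. This Python sketch is an approximation based only on the examples in this section, not the actual UrlEncode implementation: the name url_encode is hypothetical, characters are assumed to be encoded as single Latin-1 bytes, and characters outside Latin-1 are not handled.

```python
def url_encode(text: str, space_to_plus: bool = False, pre: str = "%") -> str:
    """Replace every non-alphanumeric character with pre + two hex digits."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalnum():
            out.append(ch)  # ASCII letters and digits pass through
        elif ch == " " and space_to_plus:
            out.append("+")
        else:
            # Assumption: one Latin-1 byte per character.
            out.append(f"{pre}{ch.encode('latin-1')[0]:02X}")
    return "".join(out)


print(url_encode("Mexico City"))        # Mexico%20City
print(url_encode("Mexico City", True))  # Mexico+City
print(url_encode("à"))                  # %E0
```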
The UrlDecode method decodes a URL-encoded string and has
prototype:
string UrlDecode(boolean plusToSpace, string pre)
The plusToSpace parameter specifies if we want to replace + characters in
the string with spaces (default: no). The pre parameter specifies the
string in front of hexadecimal character codes in the text to be decoded
(default: "%").
The pre parameter can be used to execute other types of encoding. E.g.
some Javascript calls require an encoding with a "\x" prefix string, as in
"\x2E" (notice that the backslash character must be escaped using another
backslash character in Biferno to avoid its interpretation as a special
character), and sometimes in emails the text must be coded using a "="
prefix string, as in "=2E".
We have described implicit conversion methods that allow automatic typecast from numbers (integers or with a decimal part) into strings. This kind of conversion is performed using an internal default format for numerical strings.
When it is necessary to convert a number into a string using a format other
than the default, we can use the ToString method of the primitive numeric
classes (int, long, double, etc.), with prototype:
string ToString(boolean wantThousandSep = false, int decimals = 2,
boolean cutRightZero = true)
It is possible to specify if we want a thousand separator (default: no),
the number of digits after the decimal separator (default: 2), and if
trailing zeroes in the decimal part should be cut (default: yes; otherwise
the decimal part is padded with zeroes to reach the required length).
The ToString method should not be confused with the tostring method (see
Chapter 11, User classes), which allows automatic string
conversion for a user class (remember that identifiers in Biferno are case
sensitive).
The curScript.SetNumFormat static method sets, for the current script only, the
decimal and thousand separators to be used during string conversion,
temporarily replacing the application defaults defined in the
"Biferno.config.bfr" file (THOUSAND_SEP and DECIMAL_SEP). This method has
prototype:
void curScript.SetNumFormat(char thousSep, char decimSep)
The following example clarifies the use of these methods.
<?
a = 1234.567 // a is of class double
b = a.ToString() // b is "1234,57" (notice the rounding)
c = a.ToString(true) // c is "1.234,57"
d = a.ToString(true, 1) // d is "1.234,6"
curScript.SetNumFormat(",",".")
e = a.ToString(true, 5) // e is "1,234.567"
f = a.ToString(true, 5, false) // f is "1,234.56700"
?>
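The interplay of rounding, trailing-zero removal and separators can be reproduced with a short model. This Python sketch is not the ToString implementation; the name to_string is hypothetical, the separators are passed as arguments instead of being set through curScript.SetNumFormat, and dropping the decimal separator when no decimal digits remain is an assumption.

```python
def to_string(value, want_thousand_sep=False, decimals=2, cut_right_zero=True,
              thousand_sep=".", decimal_sep=","):
    """Format value with the given number of decimals and separators."""
    # Round to the requested decimals; "," groups thousands when requested.
    text = f"{value:,.{decimals}f}" if want_thousand_sep else f"{value:.{decimals}f}"
    int_part, _, frac_part = text.partition(".")
    if cut_right_zero:
        frac_part = frac_part.rstrip("0")  # drop trailing zeroes
    int_part = int_part.replace(",", thousand_sep)
    return int_part + (decimal_sep + frac_part if frac_part else "")


a = 1234.567
print(to_string(a))                               # 1234,57
print(to_string(a, True, 1))                      # 1.234,6
print(to_string(a, True, 5, True, ",", "."))      # 1,234.567
print(to_string(a, True, 5, False, ",", "."))     # 1,234.56700
```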
The Eval function processes a string of text containing Biferno code. The
prototype is:
string Eval(string textToEval, boolean resume)
We can write:
<?
textToEval = "a = 3"
Eval(textToEval)
$a
?>
The third line of the example prints the value of the variable a, which is 3.
The a variable has been defined and assigned the value 3 when the text
passed to the Eval function was processed. Notice that the textToEval
string does not start with the "<?" characters. This is because the Eval
function implicitly assumes a "<?" tag before processing the text. If the
textToEval string contains plain text, that text has to be preceded by
the "?>" characters. An example is:
<?
textToEval = "a = 3?><b>Hello World</b>"
result = Eval(textToEval)
$result
?>
Notice that this behavior is different from the include behavior. In an
included file, the "<?" tag must be used explicitly before writing Biferno
code.
The last example also shows the meaning of the return value of the
function, which contains the entire text sent as output during execution.
The first example generated no output, and the function returned the empty
string. In the second example the result string contains the text:
"<b>Hello World</b>".
A possible use of the Eval function is to process a text before sending it
via email. The following code sends three emails using a text file that is
interpreted via the Eval function:
<?
email_host = "mailserver.mydomain.com"
email_from = "me@mydomain.com"
email_to = "him@hisdomain.com"
userArr = array("John", "Bob", "Carl")
for (i = 1; i <= 3; i++)
{
    username = userArr[i]
    email_text = "Subject: SendMail Test \r\n\r\n"
    email_text += Eval(file("myMailBody.bfr").Get())
    status = smtp.SendMail(email_host, email_from, email_to, email_text)
}
?>
The text file is a template containing variable parameters that are subject
to change each time the code is run (in particular the username variable).
The content of the file could look like:
Dear $username$,
Your subscription has expired!
The Eval function can also be used to invoke a function (or class member)
whose name is contained in a variable and is not known in advance, as in
the following example:
<?
// myFunc is a value passed to the script (e.g. the string "Encode")
myString = "John & Co."
textToEval = "print(myString." + myFunc + "())"
// textToEval is "print(myString.Encode())"
result = Eval(textToEval)
?>
After execution of this script, assuming the string "Encode" has been
passed to the script in the myFunc variable, the result variable will have
the value: "John & Co.". Assuming the string "UrlEncode" was passed,
the value would be "John%20%26%20Co%2E".
Notice that the textToEval string must contain the print command (or $, $$)
to output the result, so that it can be retrieved from the result
variable.
What happens if the text passed to the Eval function (textToEval
string) generates an error? In this case the value of the second parameter
(resume, left to its default value, false, in the previous examples) is
crucial.
If an error is generated the following applies:
If resume is false: we can control whether Eval interrupts our script by
using the error.Resume call in the textToEval string, according to the
error handling rules that will be described in Chapter 16, Error Handling
and Debugging. Notice that, as we will discuss in that chapter, some
errors, e.g. Err_BadSyntax, interrupt code execution even if error.Resume
has been called.
If resume is true: the Eval function interrupts the execution of the text
contained in textToEval according to the error handling rules, but,
instead of interrupting the execution of the calling script upon an
error, returns the name of the generated error in the return string.
In both cases, on the line of code after the call to the Eval function
the global variable err will contain the code of the generated error.