regexcomp, regexexec, createMatchArray

Format: regexcomp ( pattern, [flags] )
Format: regexexec ( regex, string, matcharray, [position, [flags]] )
Format: createMatchArray ( size )

Purpose: These functions provide access to the regular expression library PCRE. regexcomp() compiles a regular expression pattern for later use. regexexec() matches a string against a compiled regular expression; the same compiled expression can be reused for any number of calls. The exact syntax and semantics of the regular expressions are described in the PCRE documentation.

The regexexec() function fills an array of matching positions. It contains the start and end positions of the whole match (index 0) and of each parenthesized subexpression within the regular expression (index 1 and up). This array must be created with the createMatchArray() function before the first call.

Return Values: When successful, regexcomp() returns a compiled representation of the regular expression pattern (type tPcre). If regexcomp() fails, the compilation aborts with an error message.

If the regular expression matches the string, regexexec() stores the matching positions in matcharray and returns an integer denoting how many elements of matcharray are used. If there is no match, it returns null.

createMatchArray() returns a tuple with two fields named start and end. Each field is an array tuple with size elements, all initialized to the integer 0. The regexexec() function puts the start position of each parenthesized subexpression into the start array and the corresponding end position into the end array. As the examples below show, positions are counted from 0 and an end position is the index of the first character after the matched text.

Position Argument:

The position argument must be an integer variable denoting the position in the string where the search starts.
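
As a minimal sketch of a search that starts in the middle of a string, the following uses only constructs that appear in the examples further down this page. The string, the pattern and the start value 6 are made up for illustration. The reported positions refer to the whole string, not to the part after pos, which is why they can be passed to substring() unchanged.

heitml input:
<let
    s = "alpha beta";
    re = regexcomp ("[a-z]+");
    m = createMatchArray(1);
    pos = 6;
    n = regexexec (re, s, m, pos)>

<if !isnull(n)><? substring(s,m.start[0],m.end[0])><else> no match</if>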

Flags Argument:

The functions regexcomp() and regexexec() take an additional flags argument. It must be a string containing a comma or white space separated list of flags. The following flags are supported; their meaning is described in the PCRE documentation.

regexcomp():

ANCHORED, BSR_ANYCRLF, BSR_UNICODE, CASELESS, DOLLAR_ENDONLY, DOTALL, DUPNAMES, EXTENDED, EXTRA, FIRSTLINE, JAVASCRIPT_COMPAT, MULTILINE, NEWLINE_CR, NEWLINE_LF, NEWLINE_CRLF, NEWLINE_ANYCRLF, NEWLINE_ANY, NO_AUTO_CAPTURE, NO_UTF8_CHECK, UNGREEDY, UTF8

regexexec():

ANCHORED, NEWLINE_CR, NEWLINE_LF, NEWLINE_CRLF, NEWLINE_ANYCRLF, NEWLINE_ANY, NOTBOL, NOTEOL, NOTEMPTY, NO_START_OPTIMIZE, NO_UTF8_CHECK, PARTIAL
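
The following sketch shows one way to pass several flags at once. It assumes that a comma separated list is written exactly as the flag names appear above; the pattern and the test string are made up for illustration. CASELESS and EXTENDED are given at compile time, NOTEMPTY at match time. Because regexexec() takes its flags after the position argument, a position variable has to be supplied as well.

heitml input:
<let
    re = regexcomp ("(radpage) \\. (nospam)", "CASELESS, EXTENDED");
    m = createMatchArray(3);
    pos = 0;
    n = regexexec (re, "test@RADpage.NoSpam", m, pos, "NOTEMPTY")>

<if !isnull(n)><? m><else> no match</if>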

Example:

heitml input:
<let 
    s="test@radpage.nospam";
    re = regexcomp ("(?x:  ([^@]+) @ (.+) \\. ([^.]+) )");

    m=createMatchArray(5);
    regexexec (re, s, m); ? m;
    adr=emptytuple;
    adr.mailboxname =     substring (s,m.start[1],m.end[1]);
    adr.hostname =        substring (s,m.start[2],m.end[2]);
    adr.topleveldomain =  substring (s,m.start[3],m.end[3]);
    ?adr;
>

resulting output:
start=([0]=0, [1]=0, [2]=5, [3]=13, [4]=null)
end=([0]=19, [1]=4, [2]=12, [3]=19, [4]=null)
(mailboxname="test", hostname="radpage", topleveldomain="nospam")

Example:

heitml input:
<let re = regexcomp("((a+)(b+))((c*)(d*))","EXTENDED");
ma = createMatchArray (6);
n = regexexec(re,"aaabbbbbbcc",ma)>

<if !isnull(n)><? ma><else> 1st no match</if>
<let n = regexexec(re,"aaacc",ma)>
<if !isnull(n)><? ma> <else> 2nd no match</if>

resulting output:
start=([0]=0, [1]=0, [2]=0, [3]=3, [4]=9, [5]=9)
end=([0]=11, [1]=9, [2]=3, [3]=9, [4]=11, [5]=11)
2nd no match

Example:

heitml input:
<let
    s= "Hello! This 55 test prints all Words no chars <> ! + and no 34 numbers";

    re = regexcomp ("[a-zA-Z]+");
    m = createMatchArray(1);
    pos = 0;

    n = regexexec (re, s, m, pos);
    while !isnull(n);
       ? substring(s,m.start[0],m.end[0]); ? "<br>";
       pos=m.end[0];
       n = regexexec (re, s, m, pos);
    /while;
>

resulting output:
Hello
This
test
prints
all
Words
no
chars
and
no
numbers

Example:

heitml input:
<let
    s= "Hello! This 55 test prints all Tokens <> ! + and 34 numbers";

    re = regexcomp ("([a-zA-Z]+)|([0-9]+)|([+!<>])");
    m = createMatchArray(4);
    pos = 0;

    n = regexexec (re, s, m, pos);
    while !isnull(n);
       if     !isnull(m.start[1]) ><\br>Word: <
       elsif  !isnull(m.start[2]) ><\br>Number: <
       elsif  !isnull(m.start[3]) ><\br>Punctuation: <
       /if;
       ? substring(s,m.start[0],m.end[0]); ? " ";
       pos=m.end[0];
       n = regexexec (re, s, m, pos);
    /while
>

resulting output:
Word: Hello
Punctuation: !
Word: This
Number: 55
Word: test
Word: prints
Word: all
Word: Tokens
Punctuation: <
Punctuation: >
Punctuation: !
Punctuation: +
Word: and
Number: 34
Word: numbers

See Also: indexRE(), indexCaseRE(), containsRE(), containsCaseRE(), substRE(), substallRE().


This page was dynamically generated by the web application development tool RADpage of H.E.I.
© 1996-2017 H.E.I. All Rights Reserved.


