File scan_util.icn

Summary

General purpose scanning routines contributed by various people.

This is one of several files contributing to the util package.

Author: Steve Wampler (sbw@tapestry.tucson.az.us)

This file is in the public domain.

Classes:
FindFirst

Procedures:
WS, findFirst, isEscapeSeq, matchCSet, matchString, matchVar, sbal, scanlocus, skipOver, skipTo, snapshot, tabPast, tabSkip, ws

Imports:
lang



Details
Procedures:

WS()

Returns:
skipped over whitespace, if any
Fails:
if no whitespace

Skip whitespace. This is a matching procedure.
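
Because WS() fails when there is no whitespace, it can guard a conditional. The following is a minimal usage sketch, assuming the util package is available for import and that leading blanks count as whitespace; the sample strings are illustrative only.

     import util

     procedure main()
        local s
        every s := "   abc" | "abc" do
           s ? {
              # WS() is documented to fail when no whitespace is
              # present, so the else branch handles that case.
              if WS() then
                 write(image(s), ": whitespace skipped, &pos is now ", &pos)
              else
                 write(image(s), ": no whitespace at the start")
              }
     end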


findFirst(a)

Parameter:
a
List of strings to look for.

Like find, but accepts a list of strings and finds them by their order of appearance in the subject string. The list of strings is searched in order. To locate the longest substring first, sort the list in reverse order, as in:

     findFirst(::reverse(::sort(a)))

Deprecated in favor of the FindFirst class.


isEscapeSeq(s, esc:'\\')

Parameters:
s
String to examine
esc
Escape character (defaults to \)
Returns:
s if it ends in an odd number of escape characters

Succeed if string s ends in an odd number of escape characters. This is a specialty procedure, intended to simplify the task of determining whether the 'next' character of the string that has s as a substring is escaped. If this procedure succeeds, that 'next' character is escaped and s is returned.

The second parameter defaults to a backslash, the traditional escape character.

This is not a scanning procedure, but is placed in this source file to avoid circular dependencies when building.
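
A minimal sketch of the odd/even rule, assuming the util package is available for import. The three test strings are illustrative; in Unicon source, "\\" denotes a single backslash character.

     import util

     procedure main()
        local s
        # The strings end in zero, one, and two backslashes.
        every s := "abc" | "abc\\" | "abc\\\\" do
           if isEscapeSeq(s) then
              write(image(s), ": next character would be escaped")
           else
              write(image(s), ": next character would not be escaped")
     end

By the description above, only the string ending in a single backslash should take the first branch, since zero and two are both even counts.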


matchCSet()

Generates:
matched substrings

Tabmatch a Unicon cset. This is a matching procedure.


matchString()

Generates:
matched substrings

Tabmatch a Unicon string. This is a matching procedure.


matchVar()

Returns:
matched Unicon variable name

Tabmatch past a Unicon variable name. This is a matching procedure.


sbal(keyStrings)

Parameter:
keyStrings
table of start->stop pairs
Generates:
matching substrings

Match strings the same way bal() matches characters. The input table maps each start string (the key) to its end string (the value). sbal() makes sure the start and end strings are balanced with respect to each other.

For example, given the table:


     t := table()
     t["begin"] := "end"

then the code:

     if match("begin") then
        clause := sbal(t)

assigns to clause the substring from begin through the matching end. (Assuming, of course, that there are no conflicts along the way...)

This is an unoptimized preliminary version that may contain bugs.


scanlocus(prefix:"", n:8)

Parameters:
prefix
Precedes any output (default is "")
n
Amount of &subject to show (default is 80)
Returns:
an empty string, corresponding to a move(0)

Similar to snapshot(), but displays only a portion of very long scan subjects. This is a (trivially) matching procedure.


skipOver(n)

Parameter:
n
number of characters to move over in subject string

Skip over n characters in subject string. This is similar to move(), but doesn't construct a substring.


skipTo(p)

Parameter:
p
position to jump to in subject string

Skip to a position in the subject string. This is similar to tab(), but doesn't construct a substring.
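
The sketch below uses skipTo() and skipOver() where tab() and move() would otherwise build substrings that are thrown away. It assumes the util package is available for import; the subject string is illustrative only.

     import util

     procedure main()
        "header:payload" ? {
           skipTo(find(":"))    # like tab(find(":")), without building "header"
           skipOver(1)          # like move(1), steps over the colon
           write(tab(0))        # remainder of the subject
           }
     end

If both calls behave as documented, the remainder written should be the text after the colon.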


snapshot(prefix:"")

Parameter:
prefix
Precedes any output (default is "")
Returns:
an empty string, corresponding to a move(0)

Produce a 'snapshot' of string scanning showing the current scanning position. This is a (trivially) matching procedure.
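
Since snapshot() leaves the scanning position unchanged, it can be dropped between scanning steps while debugging. A minimal sketch, assuming the util package is available for import; the prefix text and subject are illustrative, and the exact layout of the displayed snapshot is not specified here.

     import util

     procedure main()
        "key = value" ? {
           tab(upto('='))
           snapshot("after key: ")   # display subject and current position
           move(1)
           write(tab(0))
           }
     end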


tabPast(cs)

Parameter:
cs
cset of characters to tab past
Generates:
matched substrings

Tabmatch past the next unescaped character in cs. Fails if no such unescaped character exists. This is a matching procedure.
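
A minimal sketch of scanning past a closing quote while ignoring an escaped one. It assumes the util package is available for import and that the escape character is a backslash (an assumption, since tabPast() takes no escape parameter); the sample text is illustrative only.

     import util

     procedure main()
        local text, field
        # The literal below denotes the characters:  a\"b" c
        text := "a\\\"b\" c"
        text ? {
           field := tabPast('"')
           write("matched:   ", image(\field))
           write("remaining: ", image(tab(0)))
           }
     end

Under that assumption, the escaped quote after a should be passed over and the match should end at the quote after b.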


tabSkip(s)

Parameter:
s
substring to tab up to and skip over
Returns:
tabbed over portion, omitting skipped substring

Produce everything up to a substring and skip past that substring. This is a matching procedure. On success, the scanning position is left after the substring.
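
A minimal sketch of splitting a subject around a separator, assuming the util package is available for import; the separator and sample text are illustrative only.

     import util

     procedure main()
        local key, value
        "name=>Unicon" ? {
           key := tabSkip("=>")   # text before "=>"; position moves past it
           value := tab(0)        # everything after the separator
           }
        write(\key, " / ", \value)
     end

The final write() runs only if tabSkip() succeeded, because \key fails when key is still null.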


ws()

Returns:
skipped over whitespace, if any

Matches zero or more whitespace characters (cannot fail). This is a matching procedure.
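
Because ws() cannot fail, it suits optional whitespace between tokens. A minimal sketch, assuming the util package is available for import and that blanks count as whitespace; the subject string is illustrative only.

     import util

     procedure main()
        local name, value
        "x   =  42" ? {
           name := tab(many(&letters))
           ws()                        # optional blanks before "="
           tab(match("="))
           ws()                        # optional blanks after "="
           value := tab(many(&digits))
           }
        write(\name, " -> ", \value)
     end

The same code should accept "x=42" unchanged, since ws() matches zero whitespace characters as well.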



This page produced by UniDoc on 2021/04/15 @ 23:59:54.