File str_util.icn

Summary

General-purpose string routines contributed by various people. This is one of several files contributing to the util package.

Authors: Steve Wampler (sbw@tapestry.tucson.az.us), Robert Parlett (parlett@dial.pipex.com), and others.

This file is in the public domain.

Procedures:
delPrefix, delSuffix, encodeCSV, encodeCsvField, genCSV, genFields, genFieldsOne, genWords, get_int, hideAllChars, hideEscapedChars, lPad, lTrim, lcp, lcs, listToString, mapBytes, parseCSV, rPad, replaceStrs, stringToList, zapPrefix, zapSuffix

Imports:
lang

This file is part of the util package.


Details
Procedures:

delPrefix(s, prefix)

Parameters:
s
String to examine
prefix
Prefix to remove if present
Returns:
s with prefix removed, or s if prefix not found

Strip prefix from string s.


delSuffix(s, suffix)

Parameters:
s
String to examine
suffix
Suffix to remove if present
Returns:
s with suffix removed, or s if suffix not found

Strip suffix from string s.
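
The documented behavior of delPrefix and delSuffix can be sketched as follows. This is a minimal illustration, not the util package source; the names stripPrefix and stripSuffix are hypothetical stand-ins.

```unicon
# Hypothetical stand-ins for illustration; not the util package source.
procedure stripPrefix(s, prefix)
   # =prefix matches only at the start of s; tab(0) yields the remainder.
   s ? { if =prefix then return tab(0) }
   return s
end

procedure stripSuffix(s, suffix)
   # Compare the tail of s with suffix; keep everything before it on a match.
   if *suffix <= *s & s[-*suffix:0] == suffix then
      return s[1:-*suffix]
   return s
end

procedure main()
   write(stripPrefix("libfoo.a", "lib"))   # foo.a
   write(stripPrefix("foo.a", "lib"))      # foo.a (unchanged, prefix not found)
   write(stripSuffix("foo.icn", ".icn"))   # foo
end
```

Because the original string is returned when nothing matches, the result can be used without testing for failure.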


encodeCSV(A, sep)

Parameters:
A
list to put into CSV-formatted string
sep
character used to separate fields (default ',')
Returns:
a CSV-formatted string for the list of fields

Produce a CSV-formatted string from a list of field values.


encodeCsvField(field)

Parameter:
field
field to format
Returns:
the field in legal CSV format

Put a single field into legal CSV form.


genCSV(s, sep, spanFlag)

Parameters:
s
CSV string to parse
sep
set of characters (default ',') to separate fields
spanFlag
if non-null, indicates that fields are separated by one or more separators instead of just one
Generates:
the fields in CSV-formatted string s

Generate the fields from a CSV-encoded string. Fails on any field that isn't well-formed:

- fields with double quotes or embedded separators must be enclosed in double quotes
- double quotes in fields must be doubled: e.g. to embed "five", use ""five""
- leading and trailing whitespace (blanks and tabs) is stripped unless embedded inside a double-quoted field
- if blank and/or tab are used as separators, they are not considered whitespace

Also, any characters except double quotes and newlines may be used as separators.
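
To make the quoting rules above concrete, the sketch below builds a field that satisfies the first two of them. The procedure quoteField is a hypothetical helper, not encodeCsvField itself, and it makes no attempt to protect leading or trailing whitespace.

```unicon
# Hypothetical helper for illustration; not the util package source.
procedure quoteField(field, sep)
   local out
   /sep := ','
   # A field without double quotes or separators needs no quoting.
   if not upto('"' ++ sep, field) then return field
   out := "\""
   field ? {
      # Copy up to each double quote, then emit that quote doubled.
      while out ||:= tab(upto('"')) do {
         move(1)
         out ||:= "\"\""
      }
      out ||:= tab(0)
   }
   return out || "\""
end

procedure main()
   write(quoteField("plain"))          # plain
   write(quoteField("a,b"))            # "a,b"
   write(quoteField("say \"five\""))   # "say ""five"""
end
```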


genFields(s, cs:' \t\n')

Parameters:
s
String to examine
cs
Characters that separate fields (default ' \t\n')
Generates:
the fields in s that are _separated_ by sequences of one or more characters in cs.


genFieldsOne(s, cs:' \t\n')

Parameters:
s
String to examine
cs
Characters that separate fields (default ' \t\n')
Generates:
the fields in s that are _separated_ by exactly one of the characters in cs (i.e. can have empty fields). Leading characters in cs are treated as separating empty fields.
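
The difference between genFields and genFieldsOne shows up when separators are adjacent. The sketch below is an assumption-laden illustration using hypothetical procedures fieldsMany and fieldsOne; in particular, skipping leading separators in fieldsMany and yielding a trailing empty field in fieldsOne are choices made here, not documented behavior.

```unicon
# Hypothetical stand-ins for illustration; not the util package source.
procedure fieldsMany(s, cs)
   local f
   /cs := ' \t\n'
   s ? {
      tab(many(cs))                    # skip leading separators (assumed)
      while not pos(0) do {
         f := tab(upto(cs) | 0)        # one field, up to the next separator
         suspend f
         tab(many(cs))                 # skip the whole run of separators
      }
   }
end

procedure fieldsOne(s, cs)
   local f
   /cs := ' \t\n'
   s ? {
      while f := tab(upto(cs)) do {
         suspend f
         move(1)                       # step over exactly one separator
      }
      suspend tab(0)                   # the final field, possibly empty
   }
end

procedure main()
   every writes("[", fieldsMany("a,,b", ','), "]")
   write()                             # [a][b]
   every writes("[", fieldsOne("a,,b", ','), "]")
   write()                             # [a][][b]
end
```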


genWords(s, cs)

Parameters:
s
String to examine
cs
Characters comprising words (default: &letters)
Generates:
the "words" in s. By default, words are defined as sequences of one or more letters.
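
Word generation of this kind is a standard string-scanning idiom. The following sketch uses a hypothetical procedure wordsIn to show the idea; it is not the genWords source.

```unicon
# Hypothetical stand-in for illustration; not the util package source.
procedure wordsIn(s, cs)
   local w
   /cs := &letters
   s ? {
      # Move to the next word character, then take the whole run.
      while tab(upto(cs)) do {
         w := tab(many(cs))
         suspend w
      }
   }
end

procedure main()
   # Writes Hello, world and times on separate lines.
   every write(wordsIn("Hello, world: 42 times"))
   # With digits as the word characters, writes 42.
   every write(wordsIn("Hello, world: 42 times", &digits))
end
```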


get_int()


Utility to get the next integer.


hideAllChars(s)

Parameter:
s
String of characters to hide.
Returns:
string with all characters mapped out of &ascii

Shift all the characters in a string into characters outside of &ascii. This procedure is intended as an aid when using hideEscapedChars() by making it easier to construct the third argument, as in:

    s := hideEscapedChars(s, s1, hideAllChars(s1))
 


hideEscapedChars(s, s1:"(", s2, esc:'\\')

Parameters:
s
String to examine
s1
Look for these characters
s2
... and replace with these
esc
Escape character
Returns:
s with characters in s1 hidden by replacing them with corresponding characters in s2

Hide escaped instances of characters. Often, string scanning can be simplified if some characters are 'hidden' whenever they are escaped. (For example, consider running bal() on "(\))".) This procedure can be used to hide/unhide such escaped characters by converting them to other characters (typically ones in &cset--&ascii). Unhiding is accomplished by switching the second and third arguments.

The second and third parameters behave as in map(), except that s1 defaults to "(" and s2 defaults to a character string selected from &cset--&ascii. Note that accepting the defaults makes it more challenging to unhide the characters later!

The fourth parameter defaults to a backslash.

Probably doesn't work right when trying to hide escapes...
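
The hide/unhide round trip can be sketched as below. This is an illustration under stated assumptions, not the hideEscapedChars source: the procedure hideEsc is hypothetical, it keeps the escape character in place (the description above does not say whether the library does), and the stand-in characters char(168) and char(169) are arbitrary picks from outside &ascii.

```unicon
# Hypothetical stand-in for illustration; not the util package source.
procedure hideEsc(s, s1, s2, esc)
   local out, c
   out := ""
   s ? {
      while out ||:= tab(upto(esc)) do {
         out ||:= move(1)                 # keep the escape character itself
         if c := move(1) then
            out ||:= map(c, s1, s2)       # swap the escaped character if it is in s1
      }
      out ||:= tab(0)                     # copy whatever follows the last escape
   }
   return out
end

procedure main()
   local s, h, hidden, n
   s := "f(a\\)b)"                        # the seven characters f(a\)b)
   h := char(168) || char(169)            # arbitrary stand-ins outside &ascii
   hidden := hideEsc(s, "()", h, '\\')
   n := 0
   every upto('()', hidden) do n +:= 1
   write(n)                               # 2: the escaped ")" is no longer seen
   # Swapping the second and third arguments restores the original.
   if hideEsc(hidden, h, "()", '\\') == s then write("round trip ok")
end
```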


lPad(s, n, p)

Parameters:
s
string to pad
n
desired field width. The actual width will be max(*s, n)
p
string to use as padding

Pad a string on the left if it is too short. This method is similar to the function left except that lPad returns the original string if it is longer than the specified field, while left truncates the string to fit in the field.


lTrim(s, cs)

Parameters:
s
String to examine
cs
Characters to strip from beginning of s.
Returns:
s with prefix of characters in cs stripped

Trim all characters in cs from the left edge of s. Same parameters and defaults as trim(s,cs).
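
A left-hand counterpart to trim() can be sketched in a few lines. The procedure trimLeft is a hypothetical stand-in, assuming the same default as trim(), which is a cset containing a single blank.

```unicon
# Hypothetical stand-in for illustration; not the util package source.
procedure trimLeft(s, cs)
   /cs := ' '
   # Skip the leading run of characters in cs, if any, and return the rest.
   s ? { tab(many(cs)); return tab(0) }
end

procedure main()
   write("[", trimLeft("   padded  "), "]")     # [padded  ]
   write("[", trimLeft("xxyabc", 'xy'), "]")    # [abc]
end
```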


lcp(s1, s2)

Returns:
longest common prefix of s1 and s2

Given two strings, produces the longest prefix they share in common.


lcs(s1, s2)

Returns:
longest common suffix of s1 and s2

Given two strings, produces the longest suffix they share in common.
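
One simple way to compute these results is shown below. This is a sketch with hypothetical names commonPrefix and commonSuffix; the library procedures may be implemented differently.

```unicon
# Hypothetical stand-ins for illustration; not the util package source.
procedure commonPrefix(s1, s2)
   local i
   i := 0
   # Subscripting fails past the end of either string, which ends the loop.
   while s1[i + 1] == s2[i + 1] do i +:= 1
   return s1[1+:i]
end

procedure commonSuffix(s1, s2)
   # A common suffix is a common prefix of the reversed strings.
   return reverse(commonPrefix(reverse(s1), reverse(s2)))
end

procedure main()
   write(commonPrefix("interstate", "internet"))   # inter
   write(commonSuffix("walking", "talking"))       # alking
end
```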


listToString(a, sep:"\n")

Parameters:
a
list to join elements from into a string
sep
character to join elements on (defaults to newline)
Returns:
string formed by joining elements in a

Convert a list of strings into a single string.


mapBytes(s, in, out)

Parameters:
s
String to re-arrange
in
Input byte order, e.g. "1234"
out
Output byte order, e.g. "4321"

Deprecated in favor of the MapBytes class

Do byte reorderings efficiently. Assumes that the length of string s is a multiple of the length of in. If this isn't the case, the remaining characters in s are appended unmapped.

The default action is the identity mapping.

To work, in and out must be the same length.
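
Byte reordering of this kind is commonly done with the built-in map() function, applied one chunk at a time. The sketch below assumes that approach; the procedure reorder and its parameter names are hypothetical, and the labels in inOrder must be distinct characters.

```unicon
# Hypothetical stand-in for illustration; not the util package source.
procedure reorder(s, inOrder, outOrder)
   local result, chunk
   if *inOrder = 0 then return s       # nothing to reorder
   result := ""
   s ? {
      # map(outOrder, inOrder, chunk) places each character of chunk
      # wherever its label from inOrder appears in outOrder.
      while chunk := move(*inOrder) do
         result ||:= map(outOrder, inOrder, chunk)
      # A leftover shorter than *inOrder is appended unmapped.
      result ||:= tab(0)
   }
   return result
end

procedure main()
   write(reorder("abcdefgh", "1234", "4321"))     # dcbahgfe
   write(reorder("abcdefghXY", "1234", "4321"))   # dcbahgfeXY
end
```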


parseCSV(s, sep, spanFlag)

Parameters:
s
CSV string to parse
sep
set of characters (default ',') to separate fields
spanFlag
if non-null, indicates that fields are separated by one or more separators instead of just one
Returns:
a list of the fields in the string s

Produce a list of the fields from a CSV-formatted string.


rPad(s, n, p)

Parameters:
s
string to pad
n
desired field width. The actual width will be max(*s, n)
p
string to use as padding

Pad a string on the right if it is too short. This method is similar to the function right except that rPad returns the original string if it is longer than the specified field, while right truncates the string to fit in the field.


replaceStrs(s, tbl)

Parameters:
s
String to examine
tbl
Table of replacement strings. Keys are substrings to locate, entries are corresponding replacements
Returns:
copy of s with replaced substrings

Given a string and a table of strings (keys and entries), replaces occurrences of the table key values with the corresponding entry values. Finds longest sequence matching a key value.

replaceStrs is most useful in simple cases where you have only a few replacements to make on comparatively short text. For large numbers of replacements across a lot of text, the StringReplacer class is likely to be faster.
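
The longest-match rule can be sketched as follows. The procedure replaceAll is a hypothetical stand-in that checks every key at every position, which also illustrates why this style suits only small tables; it assumes string keys and does not rescan replaced text.

```unicon
# Hypothetical stand-in for illustration; not the util package source.
procedure replaceAll(s, tbl)
   local out, k, best
   out := ""
   s ? {
      while not pos(0) do {
         best := &null
         # Among the non-empty keys matching here, remember the longest.
         every k := key(tbl) do
            if *k > 0 & match(k) & (/best | *k > *best) then best := k
         if \best then {
            move(*best)
            out ||:= tbl[best]
         } else out ||:= move(1)       # no key matches: copy one character
      }
   }
   return out
end

procedure main()
   local t
   t := table()
   t["cat"] := "dog"
   t["catalog"] := "index"
   write(replaceAll("cat catalog", t))   # dog index
end
```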


stringToList(s, sep:'\n')

Parameters:
s
string to convert into list of strings
sep
character to break string on (defaults to newline)
Returns:
list of strings from s

Convert a string with (possibly) embedded separators into a list.
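
Splitting and joining are inverse operations, as the sketch below shows. The procedures splitString and joinList are hypothetical stand-ins for stringToList and listToString; treating adjacent separators as enclosing an empty element is an assumption made here.

```unicon
# Hypothetical stand-ins for illustration; not the util package source.
procedure splitString(s, sep)
   local a
   /sep := '\n'
   a := []
   s ? {
      # Each piece ends at the next separator, which is then skipped.
      while put(a, tab(upto(sep))) do move(1)
      put(a, tab(0))                   # the piece after the last separator
   }
   return a
end

procedure joinList(a, sep)
   local out, i
   /sep := "\n"
   out := ""
   every i := 1 to *a do {
      if i > 1 then out ||:= sep       # separator between elements only
      out ||:= a[i]
   }
   return out
end

procedure main()
   local parts
   parts := splitString("one,two,three", ',')
   write(*parts)                       # 3
   write(joinList(parts, "-"))         # one-two-three
end
```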


zapPrefix(s, prefix)

Parameters:
s
String to examine
prefix
Prefix to find and remove
Returns:
the suffix on a successful match
Fails:
if prefix does not match

Does string s start with prefix?


zapSuffix(s, suffix)

Parameters:
s
String to examine
suffix
Suffix to find and remove
Returns:
the prefix on a successful match
Fails:
if suffix does not match

Does string s end in suffix?
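
Because zapPrefix and zapSuffix fail when there is no match, they fit naturally into conditional expressions, unlike delPrefix and delSuffix, which always return a string. The sketch below illustrates that pattern with a hypothetical stand-in named takePrefix.

```unicon
# Hypothetical stand-in for illustration; not the util package source.
procedure takePrefix(s, prefix)
   # Produces the remainder of s, or fails if s does not start with prefix.
   s ? { if =prefix then return tab(0) }
   fail
end

procedure main()
   local line, rest
   every line := "key=value" | "no separator" do
      if rest := takePrefix(line, "key=") then
         write("matched, rest is ", rest)    # matched, rest is value
      else
         write("no match for ", line)        # no match for no separator
end
```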



This page produced by UniDoc on 2021/04/15 @ 23:59:54.