The concatenation operator is not necessary if
constants are coded inline:

ShowMessage('This string displays'#13#10'on two lines');

Each line of an ASCII file in a DOS/Windows environment is terminated with
the pair: Carriage Return (13 or $0D), Line Feed (10 or $0A). In many Unix files, a line
in an ASCII file is terminated only with a Line Feed ($0A).

ANSIChar: one byte per character. Some day WideChars (two
bytes per character) may become the default.

pChar (allocate with strNew or strAlloc)
ARRAY[0..Length-1] OF CHAR is automatically treated as a pChar

String Types

Version  Type                    Maximum Length    Memory Required  Used for                                        Notes
-------  ----------------------  ----------------  ---------------  ----------------------------------------------  -----
D1       String, String[n]       255 characters    2 to 256 bytes                                                   n IN 0..255
D2-D5    String[n], ShortString  255 characters    2 to 256 bytes   backward compatibility                          n IN 0..255
D2-D5    String, AnsiString      ~2^31 characters  4 bytes to 2 GB  8-bit ANSI characters                           Sometimes called "long string." Preferred type for most purposes.
D2-D5    WideString              ~2^30 characters  4 bytes to 2 GB  Unicode characters; COM servers and interfaces

Notes (D2 or later):

String types can be mixed in assignments and expressions since the
compiler will automatically perform required conversions.

Strings passed by reference (as var and out parameters)
must be of the appropriate type.

ShortString (D2) maintains the 255-character limitation for compatibility with
"old" strings (and sometimes for storage efficiency). There are two ways
to declare a short string even when the "Huge String" {$H+} compiler option is
turned on: declare a ShortString, or declare a string with an explicit length of less than 256 characters:
MyString: STRING[255];
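
As a rough illustration of the two declarations, the sketch below (the type and variable names are hypothetical) shows that both forms stay short strings under {$H+}, each occupying its declared length plus one length byte:

```pascal
program ShortStringDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
{$H+}  // "Huge String" option on: a plain STRING is a long string

type
  TShort10 = string[10];   // explicit length, so still a short string

var
  Full: ShortString;       // always a 255-character short string
  Short10: TShort10;

begin
  Full := 'short string';
  Short10 := 'abc';
  WriteLn(SizeOf(Full));      // 256: one length byte + 255 characters
  WriteLn(SizeOf(Short10));   // 11: one length byte + 10 characters
  WriteLn(Length(Short10));   // 3: the dynamic length, not the capacity
end.
```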

A ShortString is really just an array that has a few additional properties.
A long string is a pointer to a null-terminated string. It is an unusual
pointer in that it has data (the reference count and length) at a negative offset.

ANSIString (Default in D2) or "long strings."
ANSIStrings act very much like ShortStrings except when you try to treat them as an array.

Delphi's string functions are highly optimized and, in general, they're
easier to use and more reliable -- even if you're familiar with C/C++ null-terminated
strings and string pointers.

The Windows API uses C-style strings. If you call Windows API
routines in Delphi you must use C-style strings. In D1, if you need strings larger
than 255 characters (and less than 65,526 bytes including the terminator) you might use a
null-terminated string.

The first character in a null-terminated string is s[0]. The first
character in a Pascal string is s[1].
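
The difference in indexing can be seen by viewing one string both ways. This is only a sketch; the variable names are illustrative:

```pascal
program IndexDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

var
  S, T: string;
  P: PChar;

begin
  S := 'Delphi';
  P := PChar(S);        // view the same characters as a null-terminated string
  WriteLn(S[1]);        // D  (Pascal strings are 1-based)
  WriteLn(P[0]);        // D  (null-terminated strings are 0-based)
  WriteLn(P[5]);        // i  (the same character as S[6])
  WriteLn(Ord(P[6]));   // 0  (the terminating null)
  T := P;               // a PChar can be assigned directly to a long string
  WriteLn(T);           // Delphi
end.
```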

Pascal strings can be compared using the usual relational operators =, <>, <,
>, <=, and >=. Use the StrComp function to compare null-terminated
strings.
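
A minimal sketch of both styles of comparison follows; the sample values are arbitrary:

```pascal
program CompareDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils;

var
  A, B: string;

begin
  A := 'apple';
  B := 'apricot';
  // Pascal strings: compare directly with the operators.
  WriteLn(A < B);                             // TRUE ('p' sorts before 'r')
  WriteLn(A <> B);                            // TRUE
  // Null-terminated strings: StrComp returns <0, 0, or >0.
  WriteLn(StrComp(PChar(A), PChar(B)) < 0);   // TRUE
  WriteLn(StrComp(PChar(A), PChar(A)) = 0);   // TRUE
end.
```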

Use the Pascal Chr function to assign ASCII values as characters to a
string element: s[5] := Chr(65);

The type pChar is a pointer to a CHAR. Use this type for a
null-terminated string.

You can directly assign a pChar to a long string.

WideChars: In Delphi you can access WideChars (since they are accessible
from the Windows API) but you cannot display Unicode strings in Delphi
controls. A WideChar has two bytes per character.

Reference Counting is automatically performed on long strings.
If you assign one AnsiString to another, Delphi does not necessarily copy the
string into a new place in memory. Delphi increments a variable that keeps track of
the number of references to the string. If a reference to the string goes out of
scope, the memory may not be deallocated but the reference count is decremented.
When the reference count becomes zero, the memory is deallocated. See
"Reference Counting," Delphi in a
Nutshell, pp. 58-61 for additional details.

According to Bob Lee in a UseNet
Post: "D5, and presumably all future versions, convert short
strings to longstrings before doing anything with them. The end result is
that you are seriously penalized for using short strings."

B. Routines

1A. "Standard" Pascal/Delphi Strings and
Character Routines

AdjustLineBreaks

function AdjustLineBreaks(const S: string): string;

AdjustLineBreaks adjusts all line breaks in the
given string S to be true CR/LF sequences. The function changes any CR characters not
followed by a LF and any LF characters not preceded by a CR into CR/LF pairs. It also
converts LF/CR pairs to CR/LF pairs. (Most Unix text files end lines with a lone LF
rather than an LF/CR pair.) (SysUtils)

CompareStr compares S1 to S2, with case-sensitivity. The return
value is less than 0 if S1 is less than S2, 0 if S1 equals S2, or greater than 0 if S1 is
greater than S2. The compare operation is based on the 8-bit ordinal value of each
character and is not affected by the current Windows locale.

Call IsDelimiter to determine
whether the character at byte offset Index in the string S is one of the
delimiters in the string Delimiters. Index is the 0-based index of the byte in question,
where 0 is the first byte of the string, 1 is the second byte, and so on.

When working with a multi-byte character system (MBCS), IsDelimiter checks to make sure
the indicated byte is not part of a double byte character. The delimiters in the
Delimiters parameter must all be single byte characters. (SysUtils)

LastDelimiter

function LastDelimiter(const Delimiters, S: string): Integer;

Call LastDelimiter to locate the last delimiter in
S. For example, the line

MyIndex := LastDelimiter('\.:','c:\filename.ext');

sets MyIndex to 12.

When working with multi-byte character sets (MBCS), S may contain double byte characters,
but the delimiters listed in the Delimiters parameter must all be single byte non-null
characters.
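
One common use is to take whatever follows the last delimiter, such as a file extension. The helper below is a hypothetical sketch and not part of SysUtils:

```pascal
program LastDelimiterDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical helper: the text after the last '\', '.' or ':' in S.
function AfterLastDelimiter(const S: string): string;
var
  Idx: Integer;
begin
  Idx := LastDelimiter('\.:', S);       // 0 when no delimiter is found
  Result := Copy(S, Idx + 1, MaxInt);   // the whole string when Idx = 0
end;

begin
  WriteLn(LastDelimiter('\.:', 'c:\filename.ext'));   // 12
  WriteLn(AfterLastDelimiter('c:\filename.ext'));     // ext
  WriteLn(AfterLastDelimiter('readme'));              // readme
end.
```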

Length

FUNCTION Length(s: STRING): INTEGER;

Returns the dynamic length of a string (TP).

LowerCase

NewStr

NullStr

const NullStr: PString = @EmptyStr;

NullStr is the return value for many string-handling routines when the string is empty.

WrapText scans a string for occurrences of any of the characters specified by
nBreakChars and inserts a line-break, specified by BreakStr, at the last occurrence of a
character in nBreakChars before MaxCol. Line is the text WrapText scans. MaxCol is the
maximum line length.

If the BreakStr and nBreakChars parameters are omitted, WrapText searches for space,
hyphen, or tab characters on which to break the line and inserts a carriage return/line
feed pair at the break points.

WrapText does not insert a break into an embedded quoted string (both single quotes and
double quotes are supported).

For example, the following call wraps the text into two lines at the last space character:
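
A minimal sketch of such a call, with sample text and a MaxCol value chosen purely for illustration (neither comes from the original example), might look like this:

```pascal
program WrapTextDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils;

const
  Sample = 'The quick brown fox jumps over the lazy dog';  // illustrative text

var
  Wrapped: string;

begin
  // Break on space, hyphen, or tab; MaxCol = 28 is an arbitrary choice
  // that is shorter than the 43-character sample, so a break is inserted.
  Wrapped := WrapText(Sample, #13#10, [' ', '-', #9], 28);
  WriteLn(Wrapped);
end.
```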

See Neil Rubenking's WrapLabel
function in the AllFuncs.pas unit in his ColorClue PC Magazine utility. WrapLabel sets the label's caption to
the specified string with sensible word-wrap even if the string contains
no spaces.

TMemo

Windows 95/98 imposes a size limit of about 32 KB on what can be put in a TMemo.
This size limit does not affect Windows NT. One way around this problem is
to use a TRichEdit object with PlainText := TRUE since it does not have
this size limitation.

According to Guido Festraets in a 27 Aug 99 UseNet Post:
"In D4 ... [the size limit] ... is 64K, but this strangely seems to depend [on] ...
the MaxLength setting. If you leave it at 0, which should mean "no limit", the
limit is in fact 32K (under Win95). Set it to a higher level, and you can go up to
64K."

While Borland cannot be faulted for this limitation (blame Microsoft), they can be
faulted for not mentioning this limitation in any online or printed documentation.

To avoid the size limit of TMemo, use the TRichEdit control.

TRichEdit

To use TRichEdit as a replacement for a TMemo, just set
the PlainText property to TRUE.

Neil Rubenking's CharEntity function, in the AllFuncs.pas unit of his ColorClue
utility, returns the HTML entity name for a character.
For example, given #034 it returns 'quot', and given #060 it returns 'lt'.

ClearString

// Possibly FillChar could be used here if speed were more important.
PROCEDURE ClearString (VAR s: STRING);
  VAR i: INTEGER;
BEGIN
  // Overwrite every character with a null; the length is unchanged.
  IF LENGTH(s) > 0
  THEN BEGIN
    FOR i := 1 TO LENGTH(s) DO
      s[i] := #$00
  END
END {ClearString};
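
The FillChar alternative mentioned in the comment could look roughly like the following. ClearStringFast is a hypothetical name, and this is a sketch rather than a tuned implementation:

```pascal
program ClearStringDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical FillChar-based variant of ClearString.
procedure ClearStringFast(var S: string);
begin
  // Guard against the empty string: S[1] does not exist in that case.
  if Length(S) > 0 then
    // SizeOf(Char) keeps the byte count right whether a character
    // occupies one byte or two.
    FillChar(S[1], Length(S) * SizeOf(Char), 0);
end;

var
  T: string;

begin
  T := 'secret';
  ClearStringFast(T);
  WriteLn(Length(T));    // 6: the length is unchanged
  WriteLn(Ord(T[1]));    // 0
  WriteLn(Ord(T[6]));    // 0
end.
```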

Cross Reference Tool. Marco Cantù describes a clever utility which
cross-references the variables, functions, procedures and more from all your source code
files, then presents the results as HTML files; he also shows how the same techniques can
be used to publish databases on the web. Delphi Magazine, Issue 30,
February 1998

ExcludeTrailingBackslash (D5)

FUNCTION ExcludeTrailingBackslash(CONST S: STRING): STRING;

Use ExcludeTrailingBackslash to modify a path name (specified by the S
parameter) so that it does not end with a backslash character (\).
If S does not end in a backslash character, ExcludeTrailingBackslash
returns a copy of S.

ExpandEnvironmentStrings

Neil Rubenking's EE function
in the AllFuncs.pas unit of his ColorClue
utility is a wrapper for ExpandEnvironmentStrings so it works properly in
both WinNTx and Win9x.

Use IncludeTrailingBackslash to modify a path name (specified by the S
parameter) so that it ends with a backslash character (\). If S
already ends in a backslash character, IncludeTrailingBackslash returns a
copy of S.
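
A short sketch of IncludeTrailingBackslash and ExcludeTrailingBackslash together is shown below. The paths are illustrative, and the results in the comments assume a Windows target:

```pascal
program BackslashDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils;

begin
  WriteLn(IncludeTrailingBackslash('c:\temp'));    // c:\temp\
  WriteLn(IncludeTrailingBackslash('c:\temp\'));   // c:\temp\ (unchanged)
  WriteLn(ExcludeTrailingBackslash('c:\temp\'));   // c:\temp
  WriteLn(ExcludeTrailingBackslash('c:\temp'));    // c:\temp  (unchanged)
end.
```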

Call MatchesMask to check the
Filename parameter using the Mask parameter to describe valid values. A valid mask
consists of literal characters, sets, and wildcards.

Each literal character must match a single character in the string. The comparison to
literal characters is case-insensitive.

Each set begins with an opening bracket ([) and ends with a closing bracket (]). Between
the brackets are the elements of the set. Each element is a literal character or a range.
Ranges are specified by an initial value, a dash (-), and a final value. Do not use spaces
or commas to separate the elements of the set. A set must match a single character in the
string. The character matches the set if it is the same as one of the literal characters
in the set, or if it is in one of the ranges in the set. A character is in a range if it
matches the initial value, the final value, or falls between the two values. All
comparisons are case-insensitive. If the first character after the opening bracket of a
set is an exclamation point (!), then the set matches any character that is not in the
set.

Wildcards are asterisks (*) or question marks (?). An asterisk matches any number of
characters. A question mark matches a single arbitrary character.

MatchesMask returns True if the string matches the mask. MatchesMask returns
false if the string does not match the mask. MatchesMask raises an exception if the mask
is syntactically invalid.

Note: The Filename parameter does not need to be a file name. MatchesMask can be used to check strings against any syntactically correct mask.
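
The rules above can be tried with a few sample masks. This sketch assumes Delphi's Masks unit, and the file names are made up for illustration:

```pascal
program MaskDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils, Masks;

begin
  // Literal characters are compared case-insensitively.
  WriteLn(MatchesMask('REPORT.TXT', '*.txt'));     // TRUE
  // A set with a range, followed by a single-character wildcard.
  WriteLn(MatchesMask('b7.dat', '[a-c]?.dat'));    // TRUE
  WriteLn(MatchesMask('d7.dat', '[a-c]?.dat'));    // FALSE
  // A leading '!' negates the set.
  WriteLn(MatchesMask('d7.dat', '[!a-c]?.dat'));   // TRUE
end.
```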

// Used mostly to get rid of '&' in user interface strings
FUNCTION RemoveChar(CONST s: STRING; CONST c: CHAR): STRING;
  VAR i: INTEGER;
BEGIN
  RESULT := '';
  // Copy every character except c into the result.
  FOR i := 1 TO LENGTH(s) DO
  BEGIN
    IF s[i] <> c
    THEN RESULT := RESULT + s[i]
  END
END {RemoveChar};

A Simple Spelling Checker. Bob Swart describes the implementation of
a spelling checker using minimal resources: ideal for when the Full Monty would be over
the top! Delphi Magazine, Issue 31, March 1998

efg's TToken class
uses a finite state machine to recognize tokens delimited with specified
"markers" and "separators." Can be used to tokenize multi-word
tokens enclosed in quotes, e.g., <This is "a very long" line> could be
tokenized into four tokens: 1. This, 2. is, 3. a very long, 4. line.
(updated Sept 2000)

Finite State Machine to Recognize Tokens

*Multiple separators can be treated as a single separator (e.g.,
space or tab white space), or treated as multiple null tokens (e.g., comma delimited
data).

Clipper Functions contains more than 140 xBASE-syntax compatible
functions for working with strings and dates, as well as many functions for converting
numbers into various formats and low-level drive and disk functions. These functions will
be very familiar to anyone who's done any xBASE (dBASE, FoxPro, or Clipper) work - many
will be familiar to those who've worked with other development tools (such as Visual
Basic) as well. http://members.aol.com/clipfunc

ESBRoutines v1.4. Miscellaneous Routines to supplement
SysUtils for Delphi 3 and Delphi 4 - though it should work well in
Delphi 2. Including 32-bit and 16-bit Bit Lists, Block Operations, String
Manipulation, Conversions and Environment Routines. Includes Help File & Full Source.
Freeware. www.esbconsult.com.au/esbrtns.zip

StringL offers some means to process strings. A collection of
procedures and functions provide several often used string formatting functions. In
addition, STRINGL contains the class TGrep which offers a versatile string search utility
using the wide-spread definition of regular expressions as e.g. in the Unix command
"grep". www.lohninger.com/stringl.html

C "char**" versus Delphi Strings. In Delphi-JEDI
Digest 74 (10 Sep 99), Danny Thorpe suggests: An array of PChar is an array of 4-byte
pointers to string data. Andre's description refers to the last string having a
double-null terminator. That's most commonly used when you're dealing with a buffer
containing multiple embedded null-terminated strings. It is not an array of
pointers.

If passing this data to a function, you can construct the string like this:

PChar(string1 + #0 + string2 + #0 +
string3 + #0 + #0)

If receiving the data from the function, you can peel off each string in the buffer in
sequence using StrEnd + 1:
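
A minimal sketch of that loop follows. SplitMultiSz and the sample buffer are hypothetical names and data used only for illustration:

```pascal
program MultiSzDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;

// Hypothetical helper: copy each string of a double-null-terminated
// buffer into List.
procedure SplitMultiSz(Buffer: PChar; List: TStrings);
var
  P: PChar;
begin
  P := Buffer;
  while (P <> nil) and (P^ <> #0) do   // an empty string marks the end
  begin
    List.Add(P);                       // copies up to the next #0
    P := StrEnd(P) + 1;                // step past that string's terminator
  end;
end;

var
  Buf: string;
  Items: TStringList;
  I: Integer;

begin
  Buf := 'alpha' + #0 + 'beta' + #0 + 'gamma' + #0 + #0;
  Items := TStringList.Create;
  try
    SplitMultiSz(PChar(Buf), Items);
    for I := 0 to Items.Count - 1 do
      WriteLn(Items[I]);               // alpha, beta, gamma (one per line)
  finally
    Items.Free;
  end;
end.
```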

Starting in D5, reference counting for strings and dynamic
arrays is thread-safe.

Reference counting for Long Strings is not thread
safe on multi-processor machines. Bug fixed in Delphi 5.

Ray Lischner's UseNet Post about thread-safe
strings in D5. Follow other posts in this thread about the performance penalty
this introduces, especially posts by Robert Lee, who is an optimization expert.