SAS String Functions

Quick reference for character and string manipulation functions in Base SAS. All functions work in the DATA step, and most also work in PROC SQL.

Length & Trimming

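| Function | Description | Example | Result |
| --- | --- | --- | --- |
| LENGTH(str) | Length excluding trailing spaces (returns 1 for an all-blank string) | LENGTH('abc   ') | 3 |
| LENGTHN(str) | Length excluding trailing spaces (returns 0 for an all-blank string) | LENGTHN('abc   ') | 3 |
| LENGTHC(str) | Length including trailing spaces | LENGTHC('abc   ') | 6 |
| TRIM(str) | Remove trailing spaces (returns one blank if all spaces) | TRIM('abc   ') | 'abc' |
| STRIP(str) | Remove leading and trailing spaces | STRIP('  abc  ') | 'abc' |
| LEFT(str) | Left-align: move leading spaces to the end (length unchanged) | LEFT('  abc') | 'abc  ' |
| TRIMN(str) | Remove trailing spaces; returns a zero-length string (not a blank) if all spaces | TRIMN('   ') | '' |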

SAS character variables are fixed-length. Trailing spaces are always present up to the variable's defined length. Use STRIP or TRIM when concatenating.
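The fixed-length behavior described above can be seen in a short DATA step. This is a minimal sketch; the dataset name `demo_len` and the variable names are illustrative assumptions, not part of the original reference.

```sas
data demo_len;
  length name $10;                       /* fixed length: always stored as 10 bytes */
  name = 'abc';                          /* stored as 'abc' followed by 7 blanks */
  n_length  = length(name);              /* 3: trailing blanks are not counted */
  n_lengthc = lengthc(name);             /* 10: the full storage length */
  raw     = '[' || name || ']';          /* padding blanks are kept */
  trimmed = '[' || trim(name) || ']';    /* '[abc]' */
  put n_length= n_lengthc= raw= trimmed=;
run;
```

The brackets make the padding visible in the log: `raw` shows the blanks between `abc` and `]`, while `trimmed` does not.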

Case Conversion

| Function | Description | Example | Result |
| --- | --- | --- | --- |
| UPCASE(str) | Convert to uppercase | UPCASE('Hello') | 'HELLO' |
| LOWCASE(str) | Convert to lowercase | LOWCASE('Hello') | 'hello' |
| PROPCASE(str) | Title case (first letter of each word) | PROPCASE('hello world') | 'Hello World' |
| PROPCASE(str, delims) | Title case with custom delimiters | PROPCASE('hello-world', '-') | 'Hello-World' |

Substrings & Position

| Function | Description | Example | Result |
| --- | --- | --- | --- |
| SUBSTR(str, pos, len) | Extract substring (1-indexed) | SUBSTR('abcdef', 2, 3) | 'bcd' |
| SUBSTR(str, pos) | From position to end | SUBSTR('abcdef', 4) | 'def' |
| INDEX(str, substr) | Position of first occurrence (0 if not found) | INDEX('abcabc', 'bc') | 2 |
| INDEXC(str, chars) | Position of first character from set | INDEXC('ab12', '0123456789') | 3 |
| INDEXW(str, word) | Position of whole word | INDEXW('one two', 'two') | 5 |
| FIND(str, substr, mods, start) | Find with modifiers ('i' = ignore case, 't' = trim trailing spaces) and start position; a negative start searches right to left | FIND('abcabc', 'bc', -6) | 5 (from back) |
| FINDC(str, chars, mods, start) | Find first character from set; modifiers such as 'b' search backward | FINDC('abc123', '0123456789') | 4 |

Note: INDEX returns 0 when not found (not -1 like many other languages). FIND also returns 0 when not found.
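Because a failed search yields 0 rather than an error, it is worth checking the position before passing it to SUBSTR. The following is a small sketch of that guard; the dataset `demo_find`, the variables, and the sample address are made up for illustration.

```sas
data demo_find;
  length email $40 domain $40;
  email = 'user@example.com';
  at_pos = index(email, '@');             /* 5 here; 0 if '@' were absent */
  if at_pos > 0 then
    domain = substr(email, at_pos + 1);   /* text after the '@' */
  else
    domain = '';                          /* not found: leave blank */
  put at_pos= domain=;
run;
```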

Concatenation

| Method | Description | Example | Result |
| --- | --- | --- | --- |
| \|\| | Concatenate (preserves leading and trailing spaces) | 'abc ' \|\| 'def' | 'abc def' |
| CAT(args) | Concatenate (preserves leading and trailing spaces) | CAT('abc', ' ', 'def') | 'abc def' |
| CATS(args) | Concatenate, stripping leading and trailing spaces | CATS('abc ', 'def') | 'abcdef' |
| CATX(sep, args) | Concatenate with separator, stripping leading and trailing spaces | CATX('-', 'a', 'b', 'c') | 'a-b-c' |
| CATT(args) | Concatenate, trimming trailing spaces only | CATT(' a ', ' b') | ' a b' |

Best practice: Use CATS or CATX instead of || to avoid unwanted trailing spaces from fixed-length character variables.
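The difference shows up as soon as the operands are fixed-length variables rather than literals. A minimal sketch, with hypothetical variable names and sample values:

```sas
data demo_cat;
  length first last $10 full_pipe full_catx $25;
  first = 'Ada';
  last  = 'Lovelace';
  full_pipe = first || ' ' || last;       /* padding of FIRST is kept before the separator */
  full_catx = catx(' ', first, last);     /* 'Ada Lovelace' */
  put full_pipe= full_catx=;
run;
```

With `||`, the seven padding blanks stored in `first` end up in the middle of the result; CATX strips each argument before inserting the separator.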

Replace & Translate

| Function | Description | Example | Result |
| --- | --- | --- | --- |
| TRANWRD(str, from, to) | Replace all occurrences of a word/string | TRANWRD('a b a', 'a', 'x') | 'x b x' |
| TRANSLATE(str, to, from) | Replace characters one-for-one | TRANSLATE('abc', 'xyz', 'abc') | 'xyz' |
| COMPRESS(str, chars, mods) | Remove specified characters | COMPRESS('a1b2c3', '0123456789') | 'abc' |
| COMPRESS(str, '', 'kd') | Keep only digits (modifier k = keep) | COMPRESS('a1b2', '', 'kd') | '12' |
| PRXCHANGE(regexp, n, str) | Regex replace (n times, -1 = all) | PRXCHANGE('s/\d+/X/', -1, 'a1b22') | 'aXbX' |

COMPRESS modifiers: a=letters, d=digits, s=spaces, p=punctuation, k=keep (instead of remove).
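The modifiers can be combined in a single string, and the character list can be left out entirely when the modifiers alone describe the set. A short sketch; the dataset name and the sample phone number are invented for illustration:

```sas
data demo_compress;
  length raw digits letters no_punct $20;
  raw = '(555) 123-4567';
  digits   = compress(raw, , 'kd');          /* keep digits: '5551234567' */
  letters  = compress('a1-b2 c3', , 'ka');   /* keep letters: 'abc' */
  no_punct = compress(raw, , 'p');           /* drop punctuation: '555 1234567' */
  put digits= letters= no_punct=;
run;
```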

Padding & Alignment

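| Function | Description | Example | Result |
| --- | --- | --- | --- |
| REPEAT(str, n) | Repeat string n+1 times | REPEAT('ab', 2) | 'ababab' |
| SUBSTR(var, 1, n) = str | Pad by assigning to a fixed-length variable (shorter values are filled with trailing spaces) | Var length 10, assign 'abc' | 'abc       ' |
| PUT(num, z5.) | Zero-pad a number as string | PUT(42, z5.) | '00042' |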

Type Conversion

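| Function | Description | Example | Result |
| --- | --- | --- | --- |
| INPUT(str, informat) | Character → numeric (the d in a w.d informat applies only when the text has no decimal point) | INPUT('3.14', 8.2) | 3.14 |
| PUT(num, format) | Numeric → character (right-aligned in the format width) | PUT(3.14, 8.2) | '    3.14' |
| INPUT(str, $char20.) | Read character with informat | INPUT(str, $char20.) | Reads up to 20 chars |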
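A round trip between the two types might look like the sketch below. The `??` modifier in INPUT suppresses the log error and the `_ERROR_` flag when the text is not a valid number; the dataset and variable names are assumptions made for the example.

```sas
data demo_convert;
  length raw $12 as_char $8;
  raw = '3.14';
  as_num  = input(raw, 8.);              /* 3.14 */
  bad_num = input('abc', ?? 8.);         /* missing (.), no error in the log */
  as_char = put(as_num, 8.2);            /* '    3.14', right-aligned */
  as_left = strip(put(as_num, 8.2));     /* '3.14' */
  put as_num= bad_num= as_char= as_left=;
run;
```

Wrapping PUT in STRIP is a common way to drop the leading blanks that a right-aligned numeric format produces.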

Pattern Matching (Regex)

| Function | Description | Example | Result |
| --- | --- | --- | --- |
| PRXMATCH(regexp, str) | Returns position of match (0 if none) | PRXMATCH('/\d+/', 'abc123') | 4 |
| PRXPARSE(regexp) | Compile regex, returns pattern ID | pid = PRXPARSE('/\d+/'); | |
| PRXCHANGE(regexp, n, str) | Replace with regex | PRXCHANGE('s/\s+/ /', -1, str) | |
| CALL PRXPOSN(pid, cap, start, len) | Get capture group position & length | After PRXMATCH or CALL PRXNEXT | |

/* Compile once for performance: a constant pattern is compiled only on the first call */
pid = prxparse('/(\d{4})-(\d{2})-(\d{2})/');
if prxmatch(pid, date_str) then do;
  /* start and len receive the position and length of capture group 1 */
  call prxposn(pid, 1, start, len);
  year = substr(date_str, start, len);
end;
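When only the captured text is needed, the PRXPOSN function (as opposed to the CALL routine) returns it directly, which avoids the separate SUBSTR step. A sketch under the same date pattern; the sample date, dataset name, and output variables are illustrative assumptions:

```sas
data demo_prx;
  length date_str $20 year month day $4;
  date_str = '2024-03-15';
  pid = prxparse('/(\d{4})-(\d{2})-(\d{2})/');
  if prxmatch(pid, date_str) then do;
    year  = prxposn(pid, 1, date_str);   /* text of capture group 1 */
    month = prxposn(pid, 2, date_str);   /* text of capture group 2 */
    day   = prxposn(pid, 3, date_str);   /* text of capture group 3 */
  end;
  drop pid;
run;
```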

Common Patterns

/* Check if string contains only digits */
is_numeric = (compress(str,'','kd') = str and str ne '');

/* Extract digits from mixed string */
digits_only = compress(str, '', 'kd');

/* Trim and collapse internal spaces */
clean = prxchange('s/\s+/ /', -1, strip(str));

/* Left-pad number to fixed width */
padded = put(id, z8.);

/* Case-insensitive search ('i' ignores case, 't' trims trailing spaces so a padded search_term still matches) */
if find(str, search_term, 'it') > 0;

/* Split on delimiter (first part) */
first_part = scan(str, 1, ',');

/* Count occurrences of substring ('t' trims trailing spaces from both arguments) */
n = count(str, target, 't');

Word & Token Parsing

| Function | Description | Example | Result |
| --- | --- | --- | --- |
| SCAN(str, n, delim) | Return nth word (negative n counts from the end) | SCAN('a,b,c', 2, ',') | 'b' |
| SCAN(str, -1) | Last word (default delimiters) | SCAN('one two three', -1) | 'three' |
| COUNTW(str, delim) | Count words/tokens | COUNTW('a,b,c', ',') | 3 |
| WORD(str, which, delim) | Not listed among the Base SAS functions; use SCAN instead | | |
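SCAN and COUNTW are often paired to walk through every token of a delimited string. The loop below is one way to sketch that, writing one row per token; the list contents and the names `demo_tokens`, `list`, and `token` are hypothetical.

```sas
data demo_tokens;
  length list $30 token $10;
  list = 'red,green,blue';
  do i = 1 to countw(list, ',');     /* 3 tokens */
    token = scan(list, i, ',');      /* i-th token */
    output;                          /* one output row per token */
  end;
  keep i token;
run;
```

Passing the same delimiter to both functions matters: with different delimiter sets, the count and the extracted tokens can disagree.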