Alphabets and Languages:
the mathematics of strings
CSC320
Strings and symbols
• An alphabet is a finite set of symbols, e.g., the binary or Roman alphabet. We denote an arbitrary alphabet by Σ.
• A string over an alphabet is a finite sequence of symbols from the alphabet.
• The empty string is the string with no symbols and is denoted ε.
• The set of all strings, including the empty string, over an alphabet Σ is denoted Σ*.
• What is the cardinality of Σ*?
• The length of a string is its length as a sequence.
• There is only one string of length 0. What is it?
• The length of a string w is denoted |w|.
• A string w of length n can be denoted w₁w₂…wₙ. The symbol in the i-th position is denoted wᵢ. We say that symbol wᵢ occurs in position i. A symbol may have more than one occurrence in a string.
Operations and relations on strings
• The operation of concatenation takes two strings x and y and produces a new string xy by putting them together end to end. The string xy is called the concatenation of x and y.
• Concatenation is an associative operation, so we write, e.g., xyz for (xy)z or x(yz).
• A string v is a substring of a string w iff there are strings x and y such that w = xvy. If y = ε then v is a suffix of w. If x = ε then v is a prefix of w.
• We write xⁱ for the string obtained by concatenating i copies of x.
• The reversal of a string w, denoted wᴿ, is the string w “written backwards”.
Languages: Sets of strings
• A language is a set of strings over an alphabet.
• We may apply set operations like union, intersection, and set difference to languages.
• The complement of a language L is Σ* − L, and is denoted L̅ if Σ is understood.
• If L₁ and L₂ are languages over Σ, their concatenation is
L = {w ∈ Σ* : w = xy for some x ∈ L₁ and y ∈ L₂}
• Denoted L₁ · L₂ or L₁L₂
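The string operations and the concatenation of languages can be tried out directly. The following is a minimal Python sketch, assuming strings are modelled as Python `str` values and finite languages as Python sets of strings; the function names are illustrative and not part of the course material.

```python
def is_substring(v, w):
    """True iff w = x + v + y for some strings x and y."""
    return any(w[i:i + len(v)] == v for i in range(len(w) - len(v) + 1))


def is_prefix(v, w):
    """True iff w = v + y for some string y (the case x = empty string)."""
    return w.startswith(v)


def is_suffix(v, w):
    """True iff w = x + v for some string x (the case y = empty string)."""
    return w.endswith(v)


def power(x, i):
    """The string obtained by concatenating i copies of x."""
    return x * i


def reversal(w):
    """The string w written backwards."""
    return w[::-1]


def concat_languages(l1, l2):
    """All strings x + y with x in l1 and y in l2 (finite languages only)."""
    return {x + y for x in l1 for y in l2}


print(is_substring("bc", "abcd"))                  # True
print(power("ab", 3))                              # ababab
print(reversal("abc"))                             # cba
print(sorted(concat_languages({"a", "ab"}, {"", "b"})))  # ['a', 'ab', 'abb']
```

In the last call, "ab" arises twice (as "a" + "b" and as "ab" + ""), so the concatenation of two 2-element languages has only 3 elements here.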
Kleene star
• The (Kleene) star of a language L, denoted L*, is the set of all strings obtained by concatenating zero or more strings from L. Thus,
L* = {w ∈ Σ* : w = w₁w₂…wₖ for some k ≥ 0 and some w₁, …, wₖ ∈ L}
• Examples: The star of Σ is Σ*; the star of ∅ is {ε}.
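Since the star of a language is usually infinite, a program can only list part of it. The sketch below, assuming a finite language given as a Python set of strings, collects every string of the star whose length is at most a chosen bound; `star_up_to` and `max_len` are illustrative names, not notation from the course.

```python
def star_up_to(language, max_len):
    """Strings of the Kleene star of `language` with length <= max_len.

    `language` must be a finite set of strings.
    """
    result = {""}      # concatenating zero strings gives the empty string
    frontier = {""}    # strings found in the previous round
    while frontier:
        new = {
            w + x
            for w in frontier
            for x in language
            if len(w + x) <= max_len
        } - result
        result |= new
        frontier = new
    return result


print(sorted(star_up_to({"ab"}, 5)))       # ['', 'ab', 'abab']
print(len(star_up_to({"0", "1"}, 2)))      # 7
print(star_up_to(set(), 3))                # {''}
```

The last call mirrors the second example above: the star of the empty language contains exactly one string, the empty string.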
• L⁺ denotes LL* and is the closure of L under concatenation. That is, it is the smallest language that includes L and all strings that are concatenations of strings in L.
Representing a language with a finite specification
• The vast majority of languages over a finite alphabet cannot be represented by a finite specification.
• Why not?
• The set Σ* of strings over a finite alphabet Σ is countably infinite (i.e., we can construct a bijection f: ℕ → Σ*).
• A specification for a language is given by a string over a finite alphabet. Therefore, the set of specifications is countably infinite, or even finite.
• But the set of possible languages is the set of subsets of Σ*, i.e., it is the power set of a countably infinite set. It is therefore uncountably infinite (Cantor’s argument).
• What languages can we specify? This is the primary question we will address in this course.
Languages and Problems
• Recall from the first lecture that we said we will be concerned with computational solutions to problems.
• A problem is a mapping from problem instances to {yes, no}.
• Languages may be viewed as an abstract representation of problems. For a problem Π, the associated language is
L_Π = {w ∈ Σ* : w is a yes instance of Π}
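As one concrete illustration of the language associated with a problem, take the problem "is the number n prime?". The Python sketch below assumes instances are encoded as binary numerals without leading zeros over the alphabet {0, 1}; both the choice of problem and the encoding are assumptions made for this example, not taken from the lecture. Deciding the problem then amounts to testing membership in the language.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small examples."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def in_language(w):
    """Membership test for the language of yes instances.

    w is in the language iff it is a binary numeral without leading
    zeros (an assumed encoding) that denotes a prime number.
    """
    if w == "" or any(c not in "01" for c in w):
        return False       # not a string over {0, 1}, or empty
    if len(w) > 1 and w[0] == "0":
        return False       # reject non-canonical encodings
    return is_prime(int(w, 2))


print(in_language("101"))   # True  (5 is prime)
print(in_language("100"))   # False (4 is not prime)
print(in_language("011"))   # False (leading zero, not a valid encoding)
```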
• So studying “specifiable” languages is analogous to studying