\documentclass{article}
\usepackage[english]{babel}
\usepackage{ucs}
\usepackage{amsfonts, amsmath, amssymb}
\usepackage{theorem}
\SetUnicodeOption{mathletters}
\usepackage[utf8x]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{hyperref}
\usepackage[pdftex]{graphicx}
% One-inch margins
\usepackage{fullpage}
% A more PDF-friendly font
\usepackage{lmodern}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\begin{document}
\title{CS 483 - Review guide}
\author{Michael Olson and Ryan Stutsman}
\date{December 10, 2006}
\maketitle
\begin{center}
Version 2.2
\bigskip
\emph{Condense soup, not books! -- Anonymous}
\end{center}
\bigskip
\tableofcontents
\clearpage
\subsection*{Preface}
\begin{verse}
Copyright \copyright{} 2006 Michael Olson and Ryan Stutsman
\end{verse}
This review guide may be used in whole or in part for any purpose, as
long as proper credit is given to the authors of this review guide.
\begin{description}
\item[Disclaimer] \mbox{}
\textbf{This review guide is neither guaranteed nor warranted to be
accurate.}
The authors of this review guide explicitly disclaim responsibility
for any damage (academic or otherwise) that may result from use of
the content contained in this guide.
The contents of this review guide are original (except for the
Theorems and Lemmas) and are not known to infringe on any copyright or
patent claims. The authors have used ideas from both the course
textbook (written by Sipser) and the lecture notes and homework
(written by Frederickson).
To the best of our knowledge, this review guide is not endorsed by any
Purdue University faculty member.
\end{description}
\begin{description}
\item[Availability] \mbox{}
The source code for this review guide is available at
\begin{verse}
\url{http://www.mwolson.org/notes/manual/cs483review.tex}
\end{verse}
The latest version of this document is available in PDF form at
\begin{verse}
\url{http://www.mwolson.org/notes/manual/cs483review.pdf}
\end{verse}
Contributions of effort and corrections are welcome, but not
mandatory. If you wish to receive updates via email when changes are
made, email \texttt{mwolson@member.fsf.org}.
\end{description}
\begin{description}
\item[Notation] \mbox{}
We use formatting like this: $L, w, f(w)$ when describing any
language, string, or function that is used as a ``parameter''.
We use formatting like this: PSPACE, HALT$_{TM}$ when describing a
class of languages or a particular language.
\end{description}
\clearpage
\section{Glossary}
\subsection{Uncountable numbers}
\begin{description}
\item[diagonalization]
Used to show that two sets do \emph{not} have the same cardinality,
i.e. that $|A| \neq |B|$.
A set is countably infinite if there exists a one-to-one
correspondence between its elements and $\mathbb{N}$. To demonstrate
uncountability, we take any claimed ``enumeration'' of the set and
show that the enumeration cannot be complete.
Diagonalization achieves this by constructing an element of the
original set that differs from the $i$th item of the enumeration in
its $i$th component, and which therefore appears nowhere in the
enumeration.
\item[Real numbers ($\mathbb{R}$)]
An uncountably infinite set.
This is demonstrated by diagonalization: given any claimed enumeration
of a set of real numbers by $\mathbb{N}$, there exists some number
$\in \mathbb{R}$ whose first digit differs from the first digit of the
first element in the enumeration, whose second digit differs from the
second digit of the second element, and so on; such a number cannot
appear anywhere in the enumeration.
\end{description}
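The diagonal construction can be sketched in a few lines of Python. This is only a sketch: the rows below are a hypothetical finite prefix of a claimed complete listing of decimal expansions.

```python
def diagonal_digits(enumeration, n):
    """Return n digits of a number that differs from row i of the
    claimed enumeration at digit i.  Adding 1 mod 10 always changes
    the digit, so the result appears in no row of the listing."""
    return [(int(enumeration[i][i]) + 1) % 10 for i in range(n)]

# A hypothetical "complete" listing of decimal expansions (prefixes).
rows = ["1415926", "7182818", "4142135"]
diag = diagonal_digits(rows, 3)
```

Digit $i$ of the constructed number differs from digit $i$ of row $i$, so the constructed number is missing from the enumeration.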
\subsection{Language classes}
\begin{verse}
NC¹ ⊆ L ⊆ NL ⊆ NC² ⊆ NC ⊆ P ⊆ NP ⊆ PSPACE ⊆ EXPTIME
\end{verse}
No single adjacent inclusion above is known to be strict, although the
hierarchy theorems imply that some of them must be; in particular, NL
$\neq$ PSPACE and P $\neq$ EXPTIME.
\subsubsection{Not recognizable}
\noindent Examples:
\begin{itemize}
\item co-A$_{TM}$
($\overline{A_{TM}}$), the complement of A$_{TM}$
\end{itemize}
\subsubsection{Undecidable}
\label{sec:undecidable}
\begin{theorem}
A language is decidable ⇔ it is both Turing-recognizable and
co-Turing-recognizable.
\end{theorem}
\bigskip
\noindent Examples:
\begin{itemize}
\item A$_{TM}$ = \{<$M$, $w$> | $M$ is a T.m. and $M$ accepts $w$\}.
\item HALT$_{TM}$ = \{<$M$, $w$> | $M$ is a T.m. and $M$ halts on input $w$\}.
\item E$_{TM}$ = \{<$M$> | $M$ is a T.m. and L($M$) = $\emptyset$\}.
\item EQ$_{TM}$ = \{<$M_1$, $M_2$> | $M_1$ and $M_2$ are T.m.'s
and L($M_1$) = L($M_2$)\}.
\item PCP: Determine whether a collection of dominoes may be arranged
such that the string formed by the top of the row of dominoes matches
the string formed by the bottom.
\end{itemize}
\subsubsection{P}
P is the class of languages that can be decided by a deterministic
T.m. in polynomial time.
\begin{theorem}
Every context-free language is a member of P.
\end{theorem}
\begin{theorem}
If $A \leq_p B$ and $B \in$ P, then $A \in$ P.
\end{theorem}
\noindent Examples:
\begin{itemize}
\item PATH = \{<$G$, $s$, $t$> | $G$ is a directed graph that has a
directed path from vertex $s$ to vertex $t$\}.
\end{itemize}
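To see why PATH is in P, here is a minimal breadth-first-search decider in Python (the adjacency-list encoding of $G$ is our own choice):

```python
from collections import deque

def path(graph, s, t):
    """Decide PATH by breadth-first search in O(V + E) time, so PATH
    is in P.  graph maps each vertex to its list of out-neighbors."""
    seen, frontier = {s}, deque([s])
    while frontier:
        u = frontier.popleft()
        if u == t:
            return True
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                frontier.append(v)
    return False

g = {1: [2, 3], 2: [4], 3: [], 4: []}
```

Marking vertices as seen before enqueueing them guarantees each vertex is processed at most once.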
\subsubsection{NP}
NP is defined as the class of languages for which a polynomial-time
verifier exists. Equivalently, it is the class of languages that can
be decided by a non-deterministic T.m. in polynomial time.
\bigskip
\noindent Examples:
\begin{itemize}
\item HAMPATH = \{<$G$, $s$, $t$> | $G$ is a directed graph with a
Hamiltonian path from $s$ to $t$\}. A Hamiltonian path visits every
node in the graph exactly once. In this case, it begins at node $s$
and ends at node $t$.
\item CLIQUE = \{<$G$, $k$> | $G$ is an undirected graph with a $k$-clique\}.
A clique is a subgraph in which every node is connected to
every other node. A $k$-clique is a clique of size $k$.
\item SUBSET-SUM = \{<$S$, $t$> | $S$ is a collection of numbers, some
subset of which sums to $t$\}.
\item SAT = \{<$\phi$> | $\phi$ is a satisfiable Boolean formula\}.
That is, we can find an assignment of \textbf{true} or \textbf{false} to
each of the variables in $\phi$ such that the formula yields \textbf{true}.
\item 3SAT = \{<$\phi$> | $\phi$ is a satisfiable Boolean formula which
is in 3-cnf form\}. That is, the formula is an AND of clauses, each of
which is an OR of exactly three literals.
\end{itemize}
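As an illustration of polynomial-time verification, here is a sketch in Python of a verifier for SUBSET-SUM, where the certificate is a proposed subset of $S$ (the encoding is our own choice):

```python
def verify_subset_sum(S, t, certificate):
    """Polynomial-time verifier for SUBSET-SUM: accept iff the
    certificate really is a sub-collection of S (respecting
    multiplicities) and sums to t."""
    remaining = list(S)
    for x in certificate:
        if x not in remaining:
            return False      # certificate is not drawn from S
        remaining.remove(x)
    return sum(certificate) == t

S = [4, 11, 16, 21, 27]
```

Checking a certificate is easy; the hard part of SUBSET-SUM is that there are exponentially many candidate subsets to guess from.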
\paragraph{NP-complete} \mbox{}
\bigskip
NP-complete languages are those that are in NP and to which every
language in NP is polynomial time reducible.
Consequently, if any one of them were shown to be in P, every language
in NP would be in P, and we would know that P = NP.
\begin{description}
\item[Cook-Levin theorem] \mbox{}
SAT is NP-complete; consequently, SAT $\in$ P ⇔ P = NP.
\end{description}
The proof that SAT is NP-complete uses a tableau to reduce any
language $A$ in NP to SAT in polynomial time. Each row of the tableau
is a configuration of the nondeterministic machine for $A$; the top
row is its starting configuration, and the tableau is accepting if
some row is an accepting configuration. The reduction produces a
formula that is satisfiable iff an accepting tableau exists.
\bigskip
\noindent Examples:
\begin{itemize}
\item SAT
\item 3SAT
\item CLIQUE
\item VERTEX-COVER = \{<$G$, $k$> | $G$ is an undirected graph that has
a $k$-node vertex cover\}. That is, there is some $k$-node subset of
the vertices of $G$ for which every edge of $G$ touches one of these
nodes.
\item HAMPATH
\item UHAMPATH is a variant of HAMPATH that has an undirected graph,
and an undirected path through the graph.
\item SUBSET-SUM
\end{itemize}
\subsubsection{PSPACE}
This is the class of languages decidable by a deterministic
T.m. using a polynomial (in the length of the input) amount of space.
\begin{description}
\item[Savitch's theorem] \mbox{}
$\forall f: \mathbb{N} \to \mathbb{R}^+$, where $f(n) \geq n$,
NSPACE($f(n)$) $\subseteq$ SPACE($(f(n))^2$).
\end{description}
This is proven by constructing an algorithm called CANYIELD. This
algorithm takes a nondeterministic T.m. $N$ (implicitly), and the
parameters $c_1$, $c_2$, and $t$. It \underline{accepts} if $N$ can
go from configuration $c_1$ to $c_2$ (along some nondeterministic path
of $N$) in at most $t$ steps.
The result of this theorem is that PSPACE = NPSPACE.
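The recursive structure of CANYIELD can be sketched in Python; the tiny configuration graph below stands in for the (exponentially large) configuration space of a real machine:

```python
def canyield(configs, succ, c1, c2, t):
    """Accept iff the machine can go from configuration c1 to c2 in
    at most t steps, where succ(c) lists the configurations reachable
    from c in one step.  Each call halves t and tries every possible
    midpoint configuration."""
    if t == 1:
        return c1 == c2 or c2 in succ(c1)
    half = (t + 1) // 2
    return any(canyield(configs, succ, c1, mid, half) and
               canyield(configs, succ, mid, c2, half)
               for mid in configs)

# A stand-in configuration graph: a -> b -> c -> d.
step = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
configs = list(step)
```

The recursion is $O(\log t)$ deep and each level stores only one midpoint configuration, which is where the $(f(n))^2$ space bound comes from.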
\paragraph{PSPACE-complete} \mbox{}
\bigskip
A language is PSPACE-complete if it is in PSPACE and
every language $A$ in PSPACE is polynomial time reducible to it.
\bigskip
\noindent Examples:
\begin{itemize}
\item TQBF = \{<$\phi$> | $\phi$ is a true fully quantified Boolean
formula\}. ``Fully quantified'' means that every variable of $\phi$ is
bound by a quantifier, with all the quantifiers in front and the
variables and operators following. Since such a formula has no free
variables, it is simply either true or false.
\item FORMULA-GAME: the same as TQBF, recast as a two-player game in
which the players take turns choosing values for the quantified
variables; the formula is true iff the $\exists$-player has a winning
strategy.
\item GG (generalized geography): a two-player game played on a
directed graph $G$ from a starting node $b$; the players alternately
move along edges to unvisited nodes, and a player who cannot move
loses. The problem asks whether Player~1 has a winning strategy.
\end{itemize}
\subsubsection{EXPTIME}
This is the class of languages decidable by a T.m. in exponential
time.
\subsubsection{L}
L is the class of languages that are decidable in log space on a
deterministic T.m. That is, L = SPACE($\log{n}$).
A log space machine has a read-only input tape and a
read/write work tape on which no more than $O(\log{n})$ cells are
used.
\subsubsection{NL}
Same as L except that NL = NSPACE($\log{n}$) (the machine is equipped
with non-determinism). Note that the space used is the maximum amount
of space used on any branch of the computation.
\paragraph{NL-complete} \mbox{}
\bigskip
\noindent $B$ is NL-complete iff
\begin{enumerate}
\item
$B \in$ NL, and
\item
$\forall A \in$ NL, $A \leq_L B$.
That is, $A$ is log space transducible to $B$.
\end{enumerate}
Log space transduction is performed by some TM with a read-only input
tape, a write-only output tape, and only uses $O(\log{n})$ space on
its read/write work tape.
In actual proof situations, we show $B \in$ NL, then pick a language
$A$ already known to be NL-complete and demonstrate $A \leq_L B$.
Since every $C \in$ NL satisfies $C \leq_L A$, and $A \leq_L B$,
transitivity of $\leq_L$ gives $C \leq_L B$ for every $C \in$ NL,
which is exactly what NL-completeness requires. Props to
transitivity.
\bigskip
\noindent Examples:
\begin{itemize}
\item
PATH is NL-complete.
\end{itemize}
\subsubsection{NC}
NC$^i$, for $i \geq 1$, is the class of languages which may be decided
by a uniform family of circuits with polynomial size and
$O((\log{n})^i)$ depth. NC is the union of all such classes.
\subsection{Mechanisms}
\subsubsection{Deterministic finite automaton}
This is abbreviated ``DFA''.
\begin{verse}
M = (Q, Σ, δ, q₀, F)
\end{verse}
Here $Q$ is a finite set of states, $\Sigma$ is the input alphabet,
$\delta: Q \times \Sigma \to Q$ is the transition function, $q_0 \in
Q$ is the start state, and $F \subseteq Q$ is the set of accept
states. DFAs recognize exactly the regular languages.
\subsubsection{Non-deterministic finite automaton}
This is abbreviated ``NFA''.
\begin{verse}
M = (Q, Σ, δ, q₀, F)
\end{verse}
The meaning of the symbols is the same as for the DFA, except that the
transition function has the form $\delta: Q \times \Sigma_\varepsilon
\to \mathcal{P}(Q)$. Like DFAs, NFAs recognize exactly the regular
languages.
\subsubsection{Push-down automaton}
This is abbreviated ``PDA''.
\begin{verse}
M = (Q, Σ, Γ, δ, q₀, F)
\end{verse}
The symbols are as for the NFA, with the addition of a stack alphabet
$\Gamma$; the transition function has the form $\delta: Q \times
\Sigma_\varepsilon \times \Gamma_\varepsilon \to \mathcal{P}(Q \times
\Gamma_\varepsilon)$. PDAs recognize exactly the context-free
languages.
\subsubsection{Grammar}
There are two main classes of grammar: context-free and
context-sensitive.
\begin{verse}
G = (V, Σ, R, S)
\end{verse}
Here $V$ is a finite set of variables, $\Sigma$ is the set of
terminals, $R$ is the set of rules, and $S \in V$ is the start
variable.
\subsubsection{Turing machine}
We abbreviate ``Turing machine'' as ``T.m.'' throughout the text.
\subsubsection{Enumerator}
This is a device that prints all of the strings in a particular
language; repeating a string is permitted.
It may be defined in terms of a T.m.; in fact, a language is
Turing-recognizable iff some enumerator enumerates it.
\subsubsection{Oracle}
This is a device that can immediately determine (in constant time)
whether a given string is part of a language.
\subsubsection{Oracle Turing machine}
This is a T.m. that uses an oracle. It is denoted $M^B$, where $B$ is the
language for which an oracle exists.
\subsubsection{Verifier}
This is an algorithm (or rather, a T.m. written in the form of an
algorithm) that, given a string $w$ and a second string $c$ (called a
``certificate''), accepts iff $c$ demonstrates that $w$ is part of the
language; $w$ is in the language iff \emph{some} certificate makes the
verifier accept. The language is an implied parameter, not an actual
one. We usually refer to a specific verifier as ``the verifier for
$L$'', where $L$ is a language.
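For instance, a verifier for CLIQUE can be sketched in Python, with the certificate being a proposed list of $k$ vertices (the edge-set encoding is our own choice):

```python
from itertools import combinations

def verify_clique(edges, k, certificate):
    """Verifier for CLIQUE: accept iff the certificate names k
    distinct vertices, every pair of which shares an edge."""
    if len(set(certificate)) != k:
        return False
    return all((u, v) in edges or (v, u) in edges
               for u, v in combinations(certificate, 2))

E = {(1, 2), (1, 3), (2, 3), (3, 4)}
```

`<G, k>` is in CLIQUE iff some certificate makes this verifier accept; the verifier itself runs in polynomial time.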
\subsubsection{Circuits}
\begin{itemize}
\item Uniform circuit family: a family of circuits $(C_n)$, one for
each input length $n$, that can be generated by some resource-bounded
T.m.
\item If $A$ ∈ TIME($t(n)$), then $A$ has circuit complexity $O(t(n)^2)$.
\item Size-depth circuit complexity: the size of a circuit is its
number of gates; the depth is the length of the longest path from an
input to the output gate.
\end{itemize}
\subsection{Techniques}
\subsubsection{Mapping reducibility}
This is denoted $L \leq_m M$. To understand what this means, we must first
define a \emph{computable function}.
\begin{description}
\item[computable function] \mbox{}
A function $f$ is computable if there is some T.m. $T$ that, given any
input $w$, halts with just $f(w)$ left on its tape.
In short,
\begin{itemize}
\item The output is a string over a fixed alphabet; no new kinds of
symbols are produced.
\item Some transformation is done on $w$.
\end{itemize}
\end{description}
Now we can define $L \leq_m M$ as: $L$ is mapping reducible to $M$ if there is a
computable function $f$ such that for every $w$ in $L$, $f(w)$ is in $M$. Also,
for every $w'$ not in $L$, $f(w')$ is not in $M$. This may be written more
succinctly as follows.
\begin{verse}
$w$ ∈ $L$ ⇔ $f(w)$ ∈ $M$
\end{verse}
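To make the definition concrete, here is a toy mapping reduction in Python between two made-up languages (not from the course): renaming symbols reduces $\{0^n1^n\}$ to $\{a^nb^n\}$.

```python
def in_L(w):
    """Membership test for the toy language L = { 0^n 1^n : n >= 0 }."""
    n = len(w) // 2
    return w == "0" * n + "1" * n

def in_M(w):
    """Membership test for the toy language M = { a^n b^n : n >= 0 }."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

def f(w):
    """A computable function witnessing L <=_m M: rename 0 -> a and
    1 -> b.  Then w is in L iff f(w) is in M."""
    return w.translate(str.maketrans("01", "ab"))
```

The reduction says nothing about how hard $L$ or $M$ is on its own; it only transfers membership questions from one to the other.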
Whenever the textbook states that ``$L$ reduces to $M$'' or ``show a
reduction from $L$'', they are referring to mapping reducibility, often
with an implicit time or space constraint, as discussed in later
techniques.
\subsubsection{Polynomial-time reducibility}
This is denoted $L \leq_p M$, and means that $L$ is mapping reducible
to $M$, with the added restriction that this reduction must occur in
polynomial time.
This is used to prove NP-completeness of a language, or that a
language is in P.
\subsubsection{Turing reducibility}
This is denoted $L \leq_T M$, and means that if an oracle for the language $M$
existed, we can write an oracle T.m. that decides language $L$. In
other words, we use a ``subroutine'' for $M$ in the definition of a T.m
for $L$.
We can also say that language $L$ is ``Turing reducible'' to language $M$.
If we relate this to programming, we could even say that a routine for
$L$ ``depends on'' a routine for $M$.
Note that in both notations, $L \leq M$ means that $L$ is no harder
than $M$; only the constructions run in opposite directions. In
mapping reducibility, we construct a function that takes the input
from the left side and transforms it into input for the right side.
In Turing reducibility, however, we construct a T.m. for the left side
given a T.m. for the right side.
We can use Turing reducibility to determine whether a language is
``decidable relative to'' another language.
\begin{description}
\item[Theorem] \mbox{}
If $L \leq_T M$, and $M$ is decidable, then $L$ is decidable.
\item[We also know] \mbox{}
If $L \leq_m M$, then $L \leq_T M$.
The reverse is not necessarily true.
\end{description}
\subsubsection{Legal windows}
% i.e. lethal windows ...
These are $2 \times 3$ windows of cells in a tableau that are
consistent with the transition function of the machine, i.e. that can
appear in a valid transition from one configuration to the next. In
the Cook-Levin proof, the formula checks every $2 \times 3$ window of
the tableau; if all of them are legal, each row legally follows from
the row above it.
\subsubsection{Log space reducibility}
This is denoted $LL \leq_L M$, and means that $LL$ is mapping reducible
to $M$, with the added restriction that this reduction must occur in
log space.
This is used to prove NL-completeness of a language, or that a
language is in L.
\clearpage
\section{Problem-solving}
\subsection{Prove language is not regular}
We will use the pumping lemma to prove that a language is not regular.
First, we assume that the language is regular, and then use the
pumping lemma to arrive at a string that \emph{should} be part of the
language, but isn't. This is a contradiction, so the language must
not have been regular to begin with.
\begin{lemma}
If $A$ is regular then $\exists p$ such that if $s \in A$ and $|s|
\geq p$, then $s$ may be written as $s = xyz$ where
\begin{enumerate}\addtolength{\itemsep}{-2ex}
\item
$\forall i \geq 0, xy^iz \in A$,
\item
$|y| > 0$, and
\item
$|xy| \leq p$.
\end{enumerate}
\end{lemma}
Note: Just because you can pump a language does not mean that it must
be regular.
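As a worked example, consider $A = \{0^n1^n \mid n \geq 0\}$. Assume
$A$ is regular with pumping length $p$, and let $s = 0^p1^p \in A$, so
$|s| \geq p$. By condition 3, $|xy| \leq p$, so $y$ lies entirely
within the $0$'s; by condition 2, $y = 0^j$ for some $j > 0$. But then
condition 1 fails for $i = 2$: $xy^2z = 0^{p+j}1^p \notin A$. This
contradiction shows that $A$ is not regular.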
\subsection{Find decider for a language}
Not yet written.
\subsection{Proving language is undecidable}
% Ugh! This one is absurdly difficult due to the tossing of names and
% waving of hands.
% Uses Turing reduction from A_TM
General strategy for some language $B$:
\begin{enumerate}
\item
Suppose $B$ is decidable by some machine, $M$.
\item
Show that some undecidable problem $A$ is reducible (in any amount of
time and space) to $B$ via some machine that employs $M$ (usually in
some devious manner).
\item
Perform obligatory hand waving.
\item
Since $A$ is undecidable and we provided a decider for it that employs
$M$, our assumption must be garbage.
\item
$B$, therefore, in all its glory, is undecidable as well.
\end{enumerate}
Select $A$ from the sweet set mentioned in~\ref{sec:undecidable};
e.g. A$_{TM}$ is reducible to HALT$_{TM}$, therefore HALT$_{TM}$ is
undecidable.
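As a concrete instance of this strategy, take $B$ = HALT$_{TM}$ and
$A$ = A$_{TM}$. Suppose some machine $M$ decides HALT$_{TM}$. Build a
decider $S$ for A$_{TM}$: on input <$T$, $w$>, run $M$ on <$T$, $w$>.
If $M$ rejects, then $T$ loops on $w$, so it certainly does not accept
$w$; reject. If $M$ accepts, simulate $T$ on $w$ (the simulation is
guaranteed to halt) and accept iff $T$ accepts. Then $S$ decides
A$_{TM}$, which is undecidable, so our assumed $M$ cannot exist and
HALT$_{TM}$ is undecidable.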
\subsection{Prove language is not context-free}
We will use an expanded variant of the pumping lemma.
\begin{lemma}
If $A$ is a CFL then $\exists p$ such that if $s \in A$ and $|s| \geq
p$, then $s$ may be written as $s = uvxyz$ where
\begin{enumerate}\addtolength{\itemsep}{-2ex}
\item
$\forall i \geq 0, uv^ixy^iz \in A$,
\item
$|vy| > 0$, and
\item
$|vxy| \leq p$.
\end{enumerate}
\end{lemma}
Note: Just because you can pump a language does not mean that it must
be a CFL.
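As a worked example, consider $A = \{a^nb^nc^n \mid n \geq 0\}$.
Assume $A$ is a CFL with pumping length $p$, and let $s = a^pb^pc^p$.
By condition 3, $|vxy| \leq p$, so $vxy$ touches at most two of the
three symbol types. By condition 2, $|vy| > 0$, so pumping with $i =
2$ increases the count of at most two symbol types while leaving the
third unchanged, producing a string $\notin A$. This contradicts
condition 1, so $A$ is not a CFL.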
\subsection{Proving language is NP-complete}
In order to prove that some language $L$ is NP-complete, we must do the
following.
\begin{enumerate}
\item Show that $L$ ∈ NP.
The most popular way to do this is to use nondeterministic
guessing of input, coupled with polynomial time verification of our
guesses. We could explicitly give an algorithm, but often a simple
paragraph suffices.
\item Show that some language known to be NP-complete is polynomial-time
reducible to $L$. We'll call this second language $M$. The
notation for this is $M \leq_p L$.
If we wanted to, instead of doing the rest of this proof, we could
show that $L$ is NP-hard (in other words, that any language in NP can
be reduced to $L$ in polynomial time). This is usually more tedious
than doing the following, however.
\begin{itemize}
\item This is done by showing a polynomial-time mapping reduction from
$M$ to $L$. Namely, write out a computable function $f$ that
takes an instance of $M$ (or equivalently, the ``input'' of $M$) and
produces an instance of $L$ (the ``input'' of $L$), doing this in
polynomial time.
\item We must also prove that an instance/input of $M$ yields
\textbf{true}\footnote{In other words, that a machine for $M$ would \underline{accept} the input.} iff the corresponding instance/input $f(w)$ of $L$ yields
\textbf{true}.
Start by assuming the input of $M$ produces a \textbf{true} result, and
then show that $f$ produces input for $L$ which must yield a
\textbf{true} result.
Then show the reverse direction -- namely, assume the produced input
for $L$ yields a \textbf{true} result, and show that the original
input for $M$ must also yield \textbf{true}.
\end{itemize}
\item Since we have shown $M \leq_p L$, and we know $A \leq_p M$ for every
language $A$ ∈ NP, $A \leq_p M \leq_p L$. This completes our proof, since we have
previously shown that $L$ ∈ NP.
\end{enumerate}
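As a concrete instance of the reduction step, here is a sketch in Python of the classic reduction 3SAT $\leq_p$ CLIQUE. The integer encoding of literals is our own choice, and the brute-force clique check is only for demonstration; it is not part of the (polynomial-time) reduction.

```python
from itertools import combinations

def threesat_to_clique(clauses):
    """Reduce 3SAT to CLIQUE: literals are nonzero ints (+i for x_i,
    -i for its negation), clauses are triples.  Build one node per
    literal occurrence and join two nodes iff they lie in different
    clauses and are not negations of each other.  The formula is
    satisfiable iff the graph has a k-clique, k = number of clauses."""
    nodes = [(i, lit) for i, clause in enumerate(clauses)
             for lit in clause]
    edges = {(u, v) for u, v in combinations(nodes, 2)
             if u[0] != v[0] and u[1] != -v[1]}
    return nodes, edges, len(clauses)

def has_clique(nodes, edges, k):
    """Brute-force k-clique check (exponential; demonstration only)."""
    return any(all((u, v) in edges or (v, u) in edges
                   for u, v in combinations(group, 2))
               for group in combinations(nodes, k))
```

A $k$-clique must take one node per clause, and the edge rule forbids contradictory choices, so a clique corresponds exactly to a consistent way of satisfying every clause.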
\subsection{Proving language in PSPACE}
In order to prove that some language $L$ is in PSPACE, we must do the
following.
\begin{itemize}
\item Define one or more machines that together decide the language in
polynomial space (that is, $O(n^k)$, $k$ ∈ $\mathbb{N}$).
\item Provide an analysis of the space usage of these machines showing
that it is $O(n^k)$ in the worst case, specifying $k$.
\end{itemize}
\subsection{Proving language in L}
In order to prove that some language $LL$ is in L, we must provide a
machine that decides $LL$ as follows.
\begin{itemize}
\item The machine that we make (call it $M$), is provided with a read-only
input tape of length $n$, and a work tape that is $O(\log{n})$.
\item One common method for reducing our space usage is to use counters
(numbers that are incremented and decremented) and store them in
binary encoding on our work tape. This encoding only takes up
$O(\log{t})$ space, as opposed to storing an equivalent sequence of
symbols, which has length $t$.
\item We can use time-wasteful algorithms that we would normally pass
over, as long as they only use logarithmic space.
\end{itemize}
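These points can be sketched in Python for the standard example language $\{0^k1^k\}$, which is in L. The two counters are the only work storage used; the input itself is never copied.

```python
def zeros_then_ones(w):
    """Decide { 0^k 1^k : k >= 0 } with one left-to-right pass over
    the read-only input and two binary counters, i.e. O(log n) bits
    of work storage."""
    zeros = ones = 0          # each counter fits in O(log n) bits
    seen_one = False
    for c in w:
        if c == "0":
            if seen_one:
                return False  # a 0 after a 1: wrong shape
            zeros += 1
        elif c == "1":
            seen_one = True
            ones += 1
        else:
            return False      # symbol outside the alphabet {0, 1}
    return zeros == ones
```

On a real log space machine the final equality test compares the counters bit by bit, which also fits in $O(\log{n})$ work space.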
\subsection{Proving language is NL-complete}
In order to prove that some language $LL$ is NL-complete, we must do the
following.
\begin{enumerate}
\item Show that $LL$ ∈ NL.
The most popular way to do this is to use nondeterministic
guessing of input, coupled with log space verification of our guesses.
We could explicitly give an algorithm, but often a simple paragraph
suffices.
\item Show that some language known to be NL-complete is log-space
reducible to $LL$. We'll call this second language $MM$. The
notation for this is $MM \leq_L LL$.
If we wanted to, instead of doing the rest of this proof, we could
show that $LL$ is NL-hard (in other words, that any language in NL can
be reduced to $LL$ using log space). This is usually more tedious than
doing the following, however.
\begin{itemize}
\item This is done by showing a logarithmic-space mapping reduction
from $MM$ to $LL$. Namely, write out a computable function\footnote{This is actually called a ``log space transducer'', which is a very
strange name for such a simple device.} $f$ that
takes an instance of $MM$ (or equivalently, the ``input'' of $MM$) and
produces an instance of $LL$ (the ``input'' of $LL$), using log space
to do so.
\item We must also prove that an instance/input of $MM$ yields
\textbf{true}\footnote{In other words, that a machine for $MM$ would \underline{accept} the input.} iff the corresponding instance/input $f(w)$ of $LL$ yields
\textbf{true}.
Start by assuming the input of $MM$ produces a \textbf{true} result, and
then show that $f$ produces input for $LL$ which must yield a
\textbf{true} result.
Then show the reverse direction -- namely, assume the produced input
for $LL$ yields a \textbf{true} result, and show that the original
input for $MM$ must also yield \textbf{true}.
\end{itemize}
\item Since we have shown $MM \leq_L LL$, and we know $A \leq_L MM$ for every
language $A$ ∈ NL, $A \leq_L MM \leq_L LL$. This completes our proof, since we have
previously shown that $LL$ ∈ NL.
\end{enumerate}
\end{document}