Algorithmes et Complexité 2, année 2023

Non-deterministic Turing Machine

A non-deterministic Turing machine (NTM) is a tuple \((Q, \Sigma, q_0, q_a, q_r, \delta_0, \delta_1)\) where \(Q\) is a finite set of states, \(\Sigma\) is a finite alphabet, \(q_0, q_a, q_r \in Q\) are respectively the initial, accepting and rejecting states, and \(\delta_0, \delta_1\) are two transition functions.

Contrary to DTM, NTM have two transition functions. Intuitively, this means that given an NTM \(M\) and a configuration \(c\) of \(M\), there are two possible configurations following \(c\): \(\delta_0(c)\) and \(\delta_1(c)\).

The memory model is the same as in the original model of computation: an NTM is equipped with a bi-infinite tape of the form \(\_^\infty u \_^\infty\) and a head that can read and write on the tape. We describe configurations in the same way as for DTM.

We use the same notations as in the previous lecture, that is, we denote by \(\delta_0(c)\) the configuration following \(c\) by \(\delta_0\) (similarly for \(\delta_1(c)\)).

Executions

Similarly to DTM, the initial configuration of an NTM \(M\) on input \(u\) is \(I(u)=(q_0, u, 0)\), that is, the head is positioned at \(0\), the state is the initial state and the tape contains only \(u\).

However, starting from this initial configuration, an NTM may have several possible executions.

An execution of \(M\) on input \(u\) is a sequence of configurations \((c_i)\) such that \(c_0=I(u)\) and \(c_{i+1} \in \{\delta_0(c_i), \delta_1(c_i)\}\) if the state of \(c_i\) is neither \(q_a\) nor \(q_r\).

The NTM \(M\) accepts the word \(u\) if and only if there exists an execution of \(M\) on input \(u\) that reaches \(q_a\). The NTM \(M\) rejects the word \(u\) if and only if every execution of \(M\) on input \(u\) reaches state \(q_r\).

An NTM \(M\) accepts a language \(L\) if \(L\) is the set of words accepted by \(M\). An NTM \(M\) rejects a language \(L\) if \(L\) is the set of words rejected by \(M\). An NTM \(M\) decides a language \(L\), denoted by \(L(M)\), if it accepts \(L\) and rejects \(\Sigma^* \setminus L\).

Examples

Variations of NTM

As for DTM, one could imagine variations of the previous model of computation. For example:

Both variants define the same notion of computation.

However, it is no longer obvious how to define a variation of NTM able to compute a function rather than only decide a language. Indeed, there may be several accepting executions, each writing a different word on the output tape.

For NTM, we will thus only consider decision problems.

Computational power of NTM

NTM and DTM have the same computational power, that is, for every language \(L\) that is recognized by an NTM \(M\), there exists a DTM \(M'\) such that \(L = L(M')\).

The idea of the proof is to simulate every possible execution of an NTM with a DTM. This can be done by enumerating every possible configuration the machine \(M\) can reach after \(t\) steps (proof sketch on blackboard).
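This simulation can be sketched as a breadth-first search over reachable configurations. In the sketch below, the machine is abstracted as two successor functions on hashable configurations; all names are invented here for illustration:

```python
from collections import deque

def ntm_accepts(delta0, delta1, initial, is_accept, is_reject, max_steps):
    """Deterministically explore all executions of an NTM by breadth-first
    search over the configurations reachable in at most max_steps steps."""
    frontier, seen = deque([(initial, 0)]), {initial}
    while frontier:
        c, t = frontier.popleft()
        if is_accept(c):
            return True          # some execution reaches q_a: accept
        if is_reject(c) or t == max_steps:
            continue             # this execution halts (or hits the bound)
        for succ in (delta0(c), delta1(c)):
            if succ not in seen:
                seen.add(succ)
                frontier.append((succ, t + 1))
    return False                 # no execution within the bound accepts
```

Note the blow-up: after \(t\) steps there can be \(2^t\) distinct executions, which is where the exponential overhead of the deterministic simulation comes from.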

Complexity

Similarly to DTM, one can define a notion of complexity for NTM. Given an NTM \(M\), we say that it has a worst-case time complexity \(f\colon\mathbb{N}\to\mathbb{N}\) if for any word \(u\), the length of every execution of \(M\) on input \(u\) is less than \(f(|u|)\).

Given a language \(L\) and a function \(f\colon\mathbb{N}\to\mathbb{N}\), we say that \(L\) has a worst-case non-deterministic time complexity \(f\) if there exists an NTM \(M\) deciding \(L\) with worst-case time complexity \(f\).

Given a function \(f\colon\mathbb{N}\to\mathbb{N}\), we denote by \(\textrm{NTIME}(f)\) the class of languages with worst-case non-deterministic time complexity \(O(f)\).

Observe that, contrary to DTIME, the notion of NTIME (and more generally the notion of languages decided by NTM) is not closed under complement. That is, \(L \in \textrm{NTIME}(f)\) does not imply that \(\Sigma^* \setminus L \in \textrm{NTIME}(f)\). Interchanging \(q_a\) and \(q_r\) does not work!

At first glance, non-deterministic time is more powerful than deterministic time, as witnessed by the following upper bound, which is essentially the best known to date:

Theorem: \(\textrm{DTIME}(f) \subseteq \textrm{NTIME}(f) \subseteq \textrm{DTIME}(2^{O(f)})\).

Non determinism in high level languages

Non-deterministic time complexity classes behave quite differently from what we expect from high-level languages. This is mainly because processors are innately deterministic.

The notion of non-determinism could however be introduced in such languages by adding a “guess” instruction:

b = GUESS()

In this case, the variable \(b\) receives a value in \(\{0,1\}\). If used in a function returning either true or false, the function evaluates to true if and only if there is a way of setting the different guesses so that the function returns true.

In particular, it makes brute force quite easy!
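A deterministic simulation of such guesses is exactly brute force over all bit strings. A minimal Python sketch (function names invented here for illustration):

```python
from itertools import product

def exists_accepting_guesses(f, n):
    """Simulate a program making n GUESS() calls: f receives the list of
    guessed bits; return True iff some choice of guesses makes f accept."""
    return any(f(list(bits)) for bits in product([0, 1], repeat=n))

# Example: "guess" a subset of (3, 5, 2) summing to 7 (a toy subset-sum).
def subset_sum_7(bits, items=(3, 5, 2)):
    return sum(x for x, b in zip(items, bits) if b) == 7

exists_accepting_guesses(subset_sum_7, 3)  # True: guessing 5 and 2 gives 7
```

The simulation makes \(2^n\) calls to \(f\), matching the exponential overhead of the upper bound above.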

See here for an actual way to program with non-determinism in Python.

Exercise

The class NP

Definition

The class \(NP\) is the non-deterministic version of \(P\), that is, the set of languages that can be decided in non-deterministic polynomial time.

More formally, \(\textrm{NP}=\bigcup_{d \in \mathbb{N}}\textrm{NTIME}(n^d)\).

As we will see in this lecture and the next, it is a robust class containing many natural problems.

The P vs NP problem

It is not known whether \(P \neq NP\); it is one of the major open questions in theoretical computer science. While it is obvious that \(P \subseteq NP\), it is widely believed that \(NP \not \subseteq P\). We will see later that, through the notion of NP-completeness, this conjecture serves as evidence of the non-existence of polynomial time algorithms for some problems.

Examples (Graph problems)

Certificates

Alternative definition using certificates: a language \(L\) is in NP if there exist a polynomial \(p\) and a DTM \(M\) running in polynomial time such that \(x \in L\) if and only if there exists \(y \in \Sigma^{p(|x|)}\), called the certificate, such that \(M(x\#y)\) accepts.
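As an illustration, a polynomial-time verifier for CLIQUE might look as follows (a sketch assuming a hypothetical encoding of graphs as sets of edge pairs, with the certificate given as a list of vertices):

```python
from itertools import combinations

def verify_clique(edges, k, certificate):
    """Check in polynomial time that the certificate lists k distinct
    vertices that are pairwise adjacent, i.e. form a clique of size k."""
    if len(certificate) != k or len(set(certificate)) != k:
        return False
    return all((u, v) in edges or (v, u) in edges
               for u, v in combinations(certificate, 2))
```

The certificate is short (polynomial in the input) and checking it is fast; finding it is the hard part.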

Important:

Proving problems are in NP

The previous characterization of NP is more useful in practice when one tries to prove that a language is in NP. Many languages ask to check whether some problem has a solution:

Examples:

Such problems are often in NP. To prove this, it is sufficient to prove the following:

Exercise: use this approach to prove that CLIQUE is in NP.

Tricky problem: ILP.

Reductions and NP-completeness

Complexity Theory is heavily based on the notion of reductions between problems. Informally, a decision problem \(A\) reduces to a decision problem \(B\) if, whenever we have an algorithm for \(B\), we can use it to solve \(A\).

Many-one Reduction

Let \(L,K\) be languages. A many-one reduction from \(K\) to \(L\) is a function \(f : \Sigma^* \to \Sigma^*\) such that for every \(x \in \Sigma^*\), \(x \in K\) if and only if \(f(x) \in L\).

If there exists a many-one reduction from \(K\) to \(L\), we say that \(K\) is many-one reducible to \(L\) and denote it by \(K \leq L\).

If moreover \(f\) is polynomial time computable, we say that \(K\) is Karp-reducible to \(L\) and denote it by \(K \leq_p L\).

Intuitively, it means that \(K\) is easier than \(L\), in the sense that if we know how to solve \(L\), we directly get an algorithm for \(K\).

Example

NP-hardness and NP-completeness

A language \(L\) is NP-hard if for every \(K \in NP\), \(K\) is Karp-reducible to \(L\). A language \(L\) is \(NP\)-complete if it is NP-hard and if \(L \in NP\).

Theorem (Cook-Levin, 71): There exists an NP-complete problem.

NP-completeness is a central notion in complexity theory. Indeed, by definition, if one can find a polynomial time algorithm for an NP-complete problem, then \(P = NP\). Proving a problem to be NP-complete is thus strong evidence that it does not admit a polynomial time algorithm (and if it did, finding one would be a major breakthrough).

As it turns out, many natural problems are NP-complete. The heavy lifting was done by Cook and Levin, who exhibited the first natural NP-complete problem, namely SAT. Once one has an NP-complete problem \(L\), to show that a problem \(K\) is NP-complete, it is sufficient to show that \(K \in NP\) and exhibit a reduction from \(L\) to \(K\).

Lemma: If \(L\) is NP-complete, \(K \in NP\) and \(L \leq_p K\) then \(K\) is NP-complete.

Exercise session

  1. Show that \(IS\) is in \(NP\) and that \(IS \leq_p CLIQUE\) and \(CLIQUE \leq_p IS\) where \(CLIQUE(G,k)\) is the problem of deciding whether there exists a clique of size at least \(k\) in \(G\) and \(IS(G,k)\) is the problem of deciding whether there exists an independent set of size at least \(k\) in \(G\).

  2. Show that \(CLIQUE \leq_p VERTEX-COVER\) where \(VERTEX-COVER(G,k)\) is the problem of deciding whether a graph \(G\) has a vertex cover of size at most \(k\) (a set \(S\) of vertices such that every edge of \(G\) contains at least one vertex from \(S\)).

    Hint: if \(S\) is a vertex-cover of \(G\), what can you say about \(V \setminus S\)?

  3. Show that \(VERTEX-COVER \leq_P 01-ILP\) where \(01\)-ILP is the problem of deciding, given an integer matrix \(C\) and an integer vector \(d\), whether there exists a vector \(x\) with entries in \(\{0,1\}\) such that \(Cx \geq d\).

    Hint: See the vertices of \(G\) as variables in the ILP.

  4. Show that \(VERTEX-COVER \leq_P HAMILTON\) where \(HAMILTON\) is the problem of deciding, given a graph \(G\), whether there exists a Hamiltonian path in \(G\) (i.e., a path going through every vertex exactly once).

  5. Does PRIME reduce to COMPOSITE? Related: is PRIME in NP? (Yes, but tricky.)
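To illustrate how short such reduction functions can be, the complement-graph construction behind exercise 1 is computable in polynomial time. A sketch, assuming graphs are encoded as a vertex set plus a set of edges:

```python
from itertools import combinations

def complement_graph(vertices, edges):
    """Karp reduction between IS and CLIQUE: S is an independent set of
    G = (V, E) iff S is a clique of the complement graph of G."""
    edges = {frozenset(e) for e in edges}
    return {frozenset(p) for p in combinations(vertices, 2)} - edges
```

Then \((G, k) \in IS\) iff \((\bar{G}, k) \in CLIQUE\), and the complement is computable in time \(O(|V|^2)\).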

The problem SAT

First we define what a boolean formula is as a formal mathematical object. We do it by induction.

Let \(X\) be a countable set of indexed variables \(\{x_0, x_1, \ldots\}\).

Each variable \(x_i \in X\) is a boolean formula. By induction, if \(\phi\) and \(\psi\) are boolean formulas, then so are \(\lnot \phi\), \(\phi \land \psi\) and \(\phi \lor \psi\).

An evaluation is a function from \(X\) to \(\{0, 1\}\). Given a formula \(\phi\) and an evaluation \(E\), we define the relation \(E \models \phi\) (read \(E\text{ models }\phi\)) by induction as follows: \(E \models x_i\) iff \(E(x_i) = 1\); \(E \models \lnot\phi\) iff \(E \not\models \phi\); \(E \models \phi \land \psi\) iff \(E \models \phi\) and \(E \models \psi\); \(E \models \phi \lor \psi\) iff \(E \models \phi\) or \(E \models \psi\).

A formula \(\phi\) is satisfiable if there exists an evaluation \(E\) such that \(E\models \phi\).

The SAT problem takes as input a boolean formula and accepts it if it is satisfiable.
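A naive decision procedure for SAT simply enumerates all evaluations. A brute-force Python sketch (the formula is abstracted as a predicate on evaluations, an encoding chosen here for illustration):

```python
from itertools import product

def satisfiable(formula, variables):
    """Brute-force SAT: try every evaluation E of the variables.
    Runs in time 2**len(variables), not polynomial."""
    return any(formula(dict(zip(variables, bits)))
               for bits in product([0, 1], repeat=len(variables)))

# (x0 or x1) and (not x0 or not x1): satisfiable by setting exactly one to 1.
phi = lambda E: (E["x0"] or E["x1"]) and (not E["x0"] or not E["x1"])
satisfiable(phi, ["x0", "x1"])  # True
```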

Theorem. SAT is NP-complete.

Exercise

Two formulas are said to be equivalent if they agree on all evaluations.

The EQUIVALENT problem takes two boolean formulas as input and accepts if they are equivalent.

Show that EQUIVALENT and SAT are co-Karp-reducible.

3-SAT

We are going to show that even a restriction of SAT is NP-complete.

We first define what it means for a boolean formula to be in CNF (conjunctive normal form).

Finally, a formula is a \(k\)-CNF if all its clauses have at most \(k\) literals.

The problem 3-SAT is a restriction of SAT to boolean formulas which are \(3\)-CNF.

Proposition. Every boolean formula is equivalent to a CNF formula.
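For instance, negations are pushed to the variables with De Morgan's laws, and disjunctions are distributed over conjunctions:

\[ x \lor (y \land z) \;\equiv\; (x \lor y) \land (x \lor z). \]

Repeated rewriting of this kind always terminates in a CNF, but the resulting formula can be exponentially larger than the original one.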

Theorem. 3-SAT is NP-complete.

It is sufficient to show that we can reduce SAT to 3-SAT.

Theorem. 2-SAT is in P.

On machine exercises

TODO

Plan of lecture:

Theorem (Cook-Levin). SAT is NP-complete. Proof sketch: represent a sequence of configurations of a polynomial-time NTM with boolean formulas and check correctness locally.

Proposition. Every Boolean function is equivalent to a CNF Boolean formula.

Theorem. \(k\)-SAT is NP-complete for \(k\geq 3\). Proof sketch: encode the full formula by materializing its formula tree; build a 3-CNF formula of polynomial size from the initial one which is satisfiable iff the initial one is.

Theorem. 2-SAT is in P. Proof sketch: build the implication graph; check that no cycle goes from \(x\) through \(\lnot x\); assign \(0\) to all variables \(x\) such that there is a path from \(x\) to \(\lnot x\); greedy algorithm assigning values to strongly connected components in reverse topological order.
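The proof sketch above can be turned into a small program. A hedged Python sketch using plain DFS reachability instead of a full SCC decomposition (so quadratic rather than linear time; the DIMACS-style literal encoding is an assumption made here):

```python
def two_sat(clauses):
    """Decide 2-SAT: clauses are pairs of nonzero ints, literal k meaning
    x_k and literal -k meaning (not x_k).  Build the implication graph
    (a or b) gives (not a -> b) and (not b -> a); the instance is
    unsatisfiable iff some x and not x lie on a common cycle."""
    graph = {}
    for a, b in clauses:
        graph.setdefault(-a, set()).add(b)
        graph.setdefault(-b, set()).add(a)

    def reaches(src, dst):
        # Iterative DFS: is dst reachable from src in the implication graph?
        stack, seen = [src], set()
        while stack:
            v = stack.pop()
            if v == dst:
                return True
            for w in graph.get(v, ()):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return False

    variables = {abs(lit) for clause in clauses for lit in clause}
    return all(not (reaches(x, -x) and reaches(-x, x)) for x in variables)
```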

Exercise Session


Compiled: Mon. 06 May 2024, 15:04:03 CEST