Algorithm and Complexity 2, année 2024

The class NP

Definition

The class $NP$ is the non-deterministic version of $P$, that is, the set of languages that can be decided in non-deterministic polynomial time.

More formally, $\textrm{P}=\bigcup_{d}\textrm{NTIME}(n^d)$.

As we will see in this lecture and the next, it is a robust class containing many natural problems.

The P vs NP problem

It is not known whether $P \neq NP$ and it is one of the major open question in theoretical computer science. While it is obvious that $P \subseteq NP$, it is widely believed that $NP \not \subseteq P$. We will see later that using the notion of NP-completeness, this conjecture is used as evidence of the non-existence of a polynomial time algorithm for some problems.

Examples (Graph problems)

Given an integer $k$ and a graph $G$, does it exist a clique of size $k$ in $G$?
$3$-coloring.
Minimal Vertex Cover.

Certificates

Alternative defintion using certificates: a language $L$ is in NP if there exists a polynomial $p$ and a DTM $M$ running in polynomial time such that $x \in L$ if and only if there exists $y \in \Sigma^{p(|x|)}$, called the certificate, such that $M(x\#y)$ accepts.

Important:

$y$ has polynomial size in $|x|$.
$M(x\#y)$ halts in time $poly(|x|+|y|)$ (but it is also polynomial in $|x|$ from the previous fact).
One can see $y$ as an help to guess.

Proving problems are in NP

The previous characterization of NP is more useful in practice when one tries to prove that a language is in NP. Many languages in are asking to check whether some problem has a solution:

Examples:

COMPOSITE asks whether one can find a divisor of a number.
$3$-COLOR asks whether one can find a $3$-coloring of a graph.

Such problems are often in NP. To prove this, it is sufficient to prove the following:

Given a candidate solution, one should be able to verify whether it is an actual solution of the problem in polynomial time.
If a solution exists to the problem, there must exist one of size polynomial in the input size.

Exercise: use this approach to prove that CLIQUE is NP.

Tricky problem: ILP.

Reductions and NP-completeness

Complexity Theory is heavily based on the notion of reductions between problems. Informally, a decision problem $A$ reduces to a decision problem $B$ if whenever we have an algorithm for $B$, then we can use it to solve problem $A$.

Many-one Reduction

Let $L,K$ be languages. A many-one reduction from $L$ to $K$ is a function $f : \Sigma^* \to \Sigma^*$ such that for every $x \in \Sigma^*$, $x \in K$ if and only if $f(x) \in L$.

If there exists a many-one reduction from $L$ to $K$, we say that $K$ is many-one reducible to $L$ and denote it by $K \leq L$.

If moreover $f$ is polynomial time computable, we say that $K$ is Karp-reducible to $L$ and denote it by $K \leq_p L$.

Intuitively, it means that $K$ is easier than $L$, in the sense that if we know how to solve $L$, we directly get an algorithm for $K$.

Example

k-CLIQUE reduces to k-BICLIQUE.

NP-hardness and NP-completeness

A language $L$ is NP-hard if for every $K \in NP$, $K$ is Karp-reducible to $L$. A language $L$ is $NP$-complete if it is NP-hard and if $L \in NP$.

Theorem (Cook-Levin, 71): There exist an NP-complete problem.

NP-completeness is a central notion in complexity theory. Indeed, by definition, if one can find a polynomial time algorithm for an NP-complete problem, then $P = NP$. Proving a problem to be NP-complete is thus strong evidence that it does not admit a polynomial time algorithm (or if it were, it will be a major breakthrough).

As it turns out, many natural problems are NP-complete. The hard lifting was done by Cook and Levin to exhibit the first natural NP-complete problem, namely SAT. Once one has an NP-complete problem $L$, to show that a problem $K$ is NP-complete, it is sufficient to show that $K \in NP$ and exhibit a reduction from $L$ to $K$.

Lemma: If $L$ is NP-complete, $K \in NP$ and $L \leq_p K$ then $K$ is NP-complete.

Exercise session

Show that $IS$ is in $NP$ and that $IS \leq_p CLIQUE$ and $CLIQUE \leq_p IS$ where $CLIQUE(G,k)$ is the problem of deciding whether there exists a clique of size at least $k$ in $G$ and $IS(G,k)$ is the problem of deciding whether there exists an independent set of size at least $k$ in $G$.
Show that $CLIQUE \leq_p VERTEX-COVER$ where $VERTEX-COVER(G,k)$ is the problem of deciding whether a graph $G$ has a vertex cover of size at most $k$ (a set $S$ of vertices such that every edge of $G$ contains at least one vertex from $S$).

Hint: if $S$ is a vertex-cover of $G$, what can you say about $V \setminus S$?
Show that $VERTEX-COVER \leq_P 01-ILP$ where $01$-ILP is the problem of deciding whether there exists a vector $x$ with entries in $\{0,1\}$ such that given an integer matrix $C$ and an integer vector $d$, $Cx \geq d$.

Hint: See the vertices of $G$ as variables in the ILP.
Show that $VERTEX-COVER \leq_P HAMILTON$ where $HAMILTON$ is the problem of deciding whether given a graph $G$, there exists an Hamiltonian path in the graph (ie, a graph going through every vertex exactly once)?
Does PRIME reduces to COMPOSITE? Related: is PRIME in NP? (Yes but tricky).

The problem SAT

First we define what is a boolean formula as a formal mathematical object. We do it by induction.

Let $X$ be a set of countably many indexed variable $\{x_0, x_1,\ldots,\}$.

For any nonnegative $i$, $x_i$ is a boolean formula.

By induction, assume we have $\phi$ and $\psi$ which are boolean formulas, then the following formulas are boolean formulas:

$\phi \land \psi$ (reads $\phi \text{ and }\psi$)
$\phi \lor \psi$ (reads $\phi \text{ or }\psi$)
$\lnot \phi$ (reads $\text{ not }\phi$)

An evaluation is a function from $X$ to $\{0, 1\}$. Given a formula and an evaluation $E$, we define the operation $E \models \phi$ (reads $E\text{ models }\phi$) by induction as follow:

For any nonnegative $i$, $E\models x_i$ if $E(x_i)=1$.
$E \models \phi \land \psi$ if $E\models \phi$ and $E$\psi$
$E \models \phi \lor \psi$ if $E\models \phi$ or $E$\psi$
$E \models \lnot \psi$ if $E\not\models \psi$

A formula $\phi$ is satisfiable if their exists an evaluation $E$ such that $E\models \phi$.

The SAT problem take in input a boolean formula and accept it if it is satisfiable.

Theorem. SAT is NP-complete.

SAT is NP: guess an evaluation and check it is correct.
SAT is NP-hard: simulates any polynomial times NTM (hard, check the references book) by encoding the transition function as SAT constraint.

Exercise

Two formulas are said to be equivalent if they agree on all evaluations.

The EQUIVALENT problem take two boolean formulas in input and accept if they are equivalent.

Show that EQUIVALENT and SAT are co-Karp-reducible.

3-SAT

We are going to show that even a restriction of SAT is NP-complete.

We first define what it means for a boolean formulas to be in CNF (conjunctive normal forms).

A literal is either a variable ($x_i$) or its negation ($\lnot x_i$).
A clause is a union of literals: $C=\bigvee_i l_i$ where $l_i$ are literals
A boolean formulas is CNF if it is conjunction of clause: $\bigwedge_i C_i$ where $C_i$ are clauses.

Finally, a formulas is a $k$-CNF if all its literals have at most $3$ variables.

The problem 3-SAT is a restriction of SAT to boolean formulas which are $3$-CNF.

Proposition. Every boolean formulas is equivalent to a CNF-formulas.

Theorem. 3-SAT is NP-complete.

It is sufficient to show that we can reduce SAT to 3-SAT.

Theorem. 2-SAT is P.

On machine exercises

TODO

Plan of lecture:

Definition of Boolean formulas
Definition of evaluation of Boolean formulas
Definition of SAT problem

Theorem. (COOK LEVIN) SAT is NP-complete Proof sketch. representing sequence of configuration of a poly NTM with booleans formulas and checking correctness locally.

Definition of normal form for proposition logic:
- literals
- clauses
- (k-)CNF

Proposition. Every Boolean function is equivalent to a CNF Boolean formulas.

Definition k-SAT problem

Theorem. k-SAT is NP-complete, $k\geq 3$. Proof sketch. - Encoding a full formulas by materializing the formulas tree. - Building a 3-CNF formulas Psize from the initial one which is SAT iff the initial is SAT.

Theorem. 2-SAT is in P. Proof sketch. - Building the implication graph. - Check that no cycle frop $x$ going through $\lnot x$. - Assign to 0 all nodes $x$ such that their is a path $x$ to $\lnot x$. - Greedy algorithm assigning values to CFC in revese topological order.

Exercise Session

Read the Wikipédia page of sudoku
Fill the skeleton here (partial solution here).

Compiled the: ven. 05 sept. 2025 16:27:20 CEST