Basi di dati

Sapienza università di Roma, Dipartimento di Informatica

Schema

An attribute is a (name,domain)(\text{name}, \text{domain}) pair; we can define the dom()dom() function on a set of names, which associates to each name a specific domain (different attributes can have the same domain)

dom:{name1,...,namen}{domain1,...,domaink}nameidomainj\begin{align*} dom : \set{\text{name}_1, ..., \text{name}_n} & \to \set{\text{domain}_1, ..., \text{domain}_k}\\ \text{name}_i &\mapsto \text{domain}_j \end{align*}

PDF 7 slide 2

A relation schema R={A1,A2,...,An}R = \set{A_1, A_2, ..., A_n} is a set of attributes

Sapienza università di Roma, Dipartimento di Informatica

Tuples & instances

PDF 7 slide 3 Given a relation schema R=A1A2...AnR = A_1 A_2 ... A_n, a tuple tt on RR is a function such that

t:Ri=1ndom(Ai)Aiadom(Ai)\begin{align*} t : R &\to \bigcup\limits_{i = 1}^n dom(A_i)\\ A_i &\mapsto a \in dom(A_i) \end{align*}

Given a relation schema RR, a subset XRX \subseteq R and tt a tuple on RR, the reduction of tt on XX is defined as

t[X]={(A,t[A])AX}t[X] = \set{(A, t[A]) \mid A \in X}

PDF 7 slide 4 Given a relation schema RR, a subset XRX \subseteq R and t1,t2t_1, t_2 tuples on RR

t1[X]=t2[X]    t1[A]=t2[A]  AXt_1[X] = t_2[X] \iff t_1[A] = t_2[A] \; \forall A \in X

PDF 7 slide 5 Given a relation schema RR and t1,t2,...,tkt_1, t_2, ..., t_k tuples on RR, a set r={t1,t2,...,tk}r = \set{t_1, t_2, ..., t_k} is an instance of RR

Sapienza università di Roma, Dipartimento di Informatica

Functional dependencies

PDF 7 slide 6

Given a relation schema RR and X,YP(R){}X, Y \in \mathcal{P}(R) \setminus \set{\varnothing} we have that (X,Y)(X, Y) is a functional dependency on RR (noted as XYX \to Y)

PDF 7 slide 7

Given a relation schema RR and a functional dependency XYX \to Y defined on RR we say that an instance rr of RR satisfies the functional dependency XYX \to Y if

t1,t2rt1[X]=t2[X]    t1[Y]=t2[Y]\forall t_1, t_2 \in r \quad t_1[X] = t_2[X] \implies t_1[Y] = t_2[Y]

Sapienza università di Roma, Dipartimento di Informatica

Instance legality & closure

PDF 7 slide 14

Given a relation schema RR and a set FF of functional dependencies defined on RR, an instance rr of RR is legal if it satisfies every dependency in FF

XYFt1,t2rt1[X]=t2[X]    t1[Y]=t2[Y]\forall X \to Y \in F \quad \forall t_1, t_2 \in r \quad t_1[X] = t_2[X] \implies t_1[Y] = t_2[Y]

PDF 7 slide 20

Given a relation schema RR and a set FF of functional dependencies defined on RR, the closure of FF is the set of functional dependencies that are satisfied by every legal instance of RR

F+={VW legal r of R,r satisfies VW}F^+ = \set{V \to W \mid \forall \text{ legal } r \text{ of } R, r \text{ satisfies } V \to W}

VWV \to W doesn't necessarily have to be in FF

Sapienza università di Roma, Dipartimento di Informatica

FF+F \subseteq F^+

PDF 7 slide 21

FF+F \subseteq F^+

Proof

F+={VW legal r of R,r satisfies VW}F^+ = \set{V \to W \mid \forall \text{ legal } r \text{ of } R, r \text{ satisfies } V \to W}

By definition rr is legal if it satisfies every dependency XYF    X \to Y \in F \implies given XYFX \to Y \in F, every legal instance of RR satisfies XY    XYF+X \to Y \implies X \to Y \in F^+

Sapienza università di Roma, Dipartimento di Informatica

Keys

PDF 7 slide 22

Given a relation schema RR and a set FF of functional dependencies on RR, KRK \subseteq R is a key of RR if

  • KRF+K \to R \in F^+
  • KK,KRF+\forall K' \subset K, \: K' \to R \notin F^+

"""\subset" means proper subset, which implies that KKK \neq K'

Sapienza università di Roma, Dipartimento di Informatica

Trivial dependencies

PDF 7 slide 26

Given a schema RR and X,YP(R){}:YXX, Y \in \mathcal{P}(R) \setminus \set{\varnothing} : Y \subseteq X, we have that every instance rr of RR satisfies the dependency XYX \to Y

Proof

Given an instance rr of R,  t1,t2rR, \; \forall t_1, t_2 \in r we have that

t1[X]=t2[X]    t_1[X] = t_2[X] \implies by definition t1[A]=t2[A]  AX    t_1[A] = t_2[A] \; \forall A \in X \implies as YXY \subseteq X we have that
t1[A]=t2[A]  AY    t_1[A'] = t_2[A'] \; \forall A' \in Y \implies by definition t1[Y]=t2[Y]t_1[Y] = t_2[Y]

As t1[X]=t2[X]    t1[Y]=t2[Y]t_1[X] = t_2[X] \implies t_1[Y] = t_2[Y] we have that rr satisfies XYX \to Y

Sapienza università di Roma, Dipartimento di Informatica

Decomposition

PDF 7 slide 27

Given a schema RR and a set of functional dependencies FF on RR, we have that

XYF+    XAF+  AYX \to Y \in F^+ \iff X \to A \in F^+ \; \forall A \in Y

Proof

XYF+     legal r of Rt1,t2rt1[X]=t2[X]    t1[Y]=t2[Y]    t1[A]=t2[A]  AY    XAF+  AYX \to Y \in F^+ \implies \forall \text{ legal } r \text{ of } R \quad \forall t_1, t_2 \in r \quad t_1[X] = t_2[X] \implies t_1[Y] = t_2[Y] \implies t_1[A] = t_2[A] \; \forall A \in Y \implies X \to A \in F^+ \; \forall A \in Y

XAF+  AY     legal r of Rt1,t2Rt1[X]=t2[X]    t1[A]=t2[A]  AY    t1[Y]=t2[Y]    XYF+X \to A \in F^+ \; \forall A \in Y \implies \forall \text{ legal } r \text{ of } R \quad \forall t_1, t_2 \in R \quad t_1[X] = t_2[X] \implies \\ t_1[A] = t_2[A] \; \forall A \in Y \implies t_1[Y] = t_2[Y] \implies X \to Y \in F^+

Sapienza università di Roma, Dipartimento di Informatica

FAF^A

PDF 8 slide 3 FAF^A is a set of functional dependencies on RR such that

  • XYF    XYFAX \to Y \in F \implies X \to Y \in F^A
  • YXR    XYFAY \subseteq X \in R \implies X \to Y \in F^A (refelxivity)
  • ZR,XYFA    ZXZYFA\forall Z \in R, X \to Y \in F^A \implies ZX \to ZY \in F^A (augmentation)
  • XY,YZFA    XZFAX \to Y, Y \to Z \in F^A \implies X \to Z \in F^A (transitivity)

PDF 8 slide 6 derivates

  • XY,XZFA    XYZFAX \to Y, X \to Z \in F^A \implies X \to YZ \in F^A (union)
  • XYFAZY    XZFAX \to Y \in F^A \land Z \subseteq Y \implies X \to Z \in F^A (decomposition)
  • XY,WYZFA    WXZFAX \to Y, WY \to Z \in F^A \implies WX \to Z \in F^A (pseudotransitivity)

PDF 8 slide 8 XA1A2...AnFA    i=1,...,nXAiFAX \to A_1 A_2 ... A_n \in F^A \iff \forall i=1,...,n \quad X \to A_i \in F^A

Sapienza università di Roma, Dipartimento di Informatica

Derivates (Proofs)

Union

XY,XZFA    X \to Y, X \to Z \in F^A \implies by augmentation XXY,XYYZFA    X \to XY, XY \to YZ \in F^A \implies by transitivity XYZFAX \to YZ \in F^A

Decomposition

XYFAZY    YZFA    X \to Y \in F^A \land Z \subseteq Y \implies Y \to Z \in F^A \implies by transitivity XZFAX \to Z \in F^A

Pseudotransitivity

XY,WYZFA    X \to Y, WY \to Z \in F^A \implies by augmentation WXWYFA    WX \to WY \in F^A \implies by transitivity WXZFAWX \to Z \in F^A

Sapienza università di Roma, Dipartimento di Informatica

(X)F+(X)^+_F

PDF 8 slide 9

Given a relation schema RR, a set FF of dependencies on RR and XRX \subseteq R. The closure of XX with respect to FF, denoted (X)F+(X)^+_F is defined as

(X)F+={ARXAFA}(X)^+_F = \set{A \in R \mid X \to A \in F^A}

We have that X(X)F+X \subseteq (X)^+_F

Proof

AX\forall A \in X by reflexivity XAFA    X \to A \in F^A \implies by definition A(X)F+    X(X)F+A \in (X)^+_F \implies X \subseteq (X)^+_F

We can use Armstrong's axioms as (X)F+(X)^+_F is defined of FAF^A

NOTE: $(X)

Sapienza università di Roma, Dipartimento di Informatica

Lemma of closure

PDF 8 slide 10

Let RR be a schema and FF a set of functional dependencies on RR

XYFA    Y(X)F+X \to Y \in F^A \iff Y \subseteq (X)^+_F

Proof

XYFA    X \to Y \in F^A \implies by decomposition XAFA  AY    X \to A \in F^A \; \forall A \in Y \implies by definition
A(X)F+  AY    Y(X)F+A \in (X)^+_F \; \forall A \in Y \implies Y \subseteq (X)^+_F

Y(X)F+    A(X)F+  AY    Y \subseteq (X)^+_F \implies A \in (X)^+_F \; \forall A \in Y \implies by definition XAFA  AY    X \to A \in F^A \; \forall A \in Y \implies by union
XYFAX \to Y \in F^A

Sapienza università di Roma, Dipartimento di Informatica

F+=FAF^+ = F^A

PDF 8 slide 11

Let RR be a relation schema and FF a set of functional dependencies on RR then F+=FAF^+ = F^A

Proof

Let FiF_i be the value of FF after the ii-th application of an Armstrong's axiom, with F0=FF_0 = F

Sapienza università di Roma, Dipartimento di Informatica

FAF+F^A \subseteq F^+

Base case

F0=FF+    F0F+F_0 = F \subseteq F^+ \implies F_0 \subseteq F^+

Inductive step

FiF+    Fi+1F+F_i \subseteq F^+ \implies F_{i + 1} \subseteq F^+

Let XYFi+1X \to Y \in F_{i + 1}, either

  • XYFi    X \to Y \in F_i \implies by HP XYF+X \to Y \in F^+
  • XYFi+1FiX \to Y \in F_{i + 1} \setminus F_i, which means that XYX \to Y has been optained through one of the axioms
Sapienza università di Roma, Dipartimento di Informatica

FAF+F^A \subseteq F^+

Reflexivity

YX    Y \subseteq X \implies given that XYX \to Y is satisfied by every instance XYF+X \to Y \in F^+

Augmentation

ZR,X=ZV,Y=ZWVWFiZ \subseteq R, X = ZV, Y = ZW \land V \to W \in F_i given t1,t2rt_1, t_2 \in r legal instance of RR we have that t1[X]=t2[X]    (t1[V]=t2[V]    t_1[X] = t_2[X] \implies (t_1[V] = t_2[V] \implies by HP t1[W]=t2[W])t1[Z]=t2[Z]    t1[Y]=t2[Y]t_1[W] = t_2[W]) \land t_1[Z] = t_2[Z] \implies t_1[Y] = t_2[Y]

Transitivity

XZ,ZYFi    X \to Z, Z \to Y \in F_i \implies by HP  legal r of R,t1,t2r,t1[X]=t2[X]    t1[Z]=t2[Z]    t1[Y]=t2[Y]\forall \text{ legal } r \text{ of } R, \forall t_1, t_2 \in r, t_1[X] = t_2[X] \implies t_1[Z] = t_2[Z] \implies t_1[Y] = t_2[Y] we have that t1[X]=t2[X]    t1[Y]=t2[Y]    XYF+t_1[X] = t_2[X] \implies t_1[Y] = t_2[Y] \implies X \to Y \in F^+

Sapienza università di Roma, Dipartimento di Informatica

F+FAF^+ \subseteq F^A (legal instance)

Given XRX \subseteq R we can build an instance r={t1,t2}r = \set{t_1, t_2} on RR such that

rr

(X)F+(X)^+_F

R(X)F+R \setminus (X)^+_F

t1t_1

1 1 1 ... 1 1 1 1 ... 1

t2t_2

1 1 1 ... 1 0 0 0 ... 0

Let's verify that rr is a legal instance. Given VWFV \to W \in F, as V,WV, W \neq \varnothing by definition, we could have

  • V(X)F+    AV:AR(X)F+    t1[V]t2[V]    rV \nsubseteq (X)^+_F \implies \exists A \in V : A \in R \setminus (X)^+_F \implies t_1[V] \neq t_2[V] \implies r satisfies VWV \to W
  • V(X)F+V \subseteq (X)^+_F, we could have that
    • W(X)F+    t1[V]=t2[V]t1[W]=t2[W]    rW \subseteq (X)^+_F \implies t_1[V] = t_2[V] \land t_1[W] = t_2[W] \implies r satisfies VWV \to W
    • W(X)F+    AW:AR(X)F+    t1[V]=t2[V]t1[W]t2[W]W \nsubseteq (X)^+_F \implies \exists A \in W : A \in R \setminus (X)^+_F \implies t_1[V] = t_2[V] \land t_1[W] \neq t_2[W]
Sapienza università di Roma, Dipartimento di Informatica

F+FAF^+ \subseteq F^A (legal instance)

In the last case rr doesn't satisfy VWV \to W, so we have to show that it can't happen. Let's suppose that VWF\exists V \to W \in F such that rr doesn't satisfy VWV \to W; by construction we have that

V(X)F+AW:AR(X)F+    A(X)F+V \subseteq (X)^+_F \land \exists A \in W : A \in R \setminus (X)^+_F \implies A \notin (X)^+_F

We have that

  • V(X)F+    V \subseteq (X)^+_F \implies by the lemma of closure XVFAX \to V \in F^A
  • AW    A \in W \implies by decomposition VAFAV \to A \in F^A

By transitivity XAFA    X \to A \in F^A \implies by the lemma of closure A(X)F+A \in (X)^+_F which is a contraddiction

Legality

In the first 2 cases rr satisfies VWFV \to W \in F, case 3 can't happen     r\implies r is a legal instance of RR

Sapienza università di Roma, Dipartimento di Informatica

F+FAF^+ \subseteq F^A

Let's consider XYF+X \to Y \in F^+

By definition we have that X(X)F+    X \subseteq (X)^+_F \implies by construction t1[X]=t2[X]    t_1[X] = t_2[X] \implies by hypotesis and given that rr is a legal instance t1[Y]=t2[Y]    t_1[Y] = t_2[Y] \implies by the lemma Y(X)F+    XYFAY \subseteq (X)^+_F \implies X \to Y \in F^A

F+=FAF^+ = F^A

Given that FiF+iNF_i \subseteq F^+ \: \forall i \in \mathbb{N} and F+FAF^+ \subseteq F^A we have that F+=FAF^+ = F^A

Sapienza università di Roma, Dipartimento di Informatica

3NF

PDF 9 slide 14

Given a relation schema RR and a set of functional dependencies FF on RR.

RR is in 3NF if XAF+:AX\forall X \to A \in F^+ : A \notin X either

  • AA is prime (belongs to a key)
  • XX is superkey
Sapienza università di Roma, Dipartimento di Informatica

3NF pt.2

PDF 10 slide 4

Let RR be a relation schema and FF a set of functional dependencies on RR

An attribute ARA \in R partially depends on a key KK if

  • XR:AXXAFXK\exists X \subset R : A \notin X \land X \to A \in F \land X \subset K
  • AA isn't prime

An attribute ARA \in R transitively depends on a key KK if

  • XR:AXXAFKXF\exists X \subset R : A \notin X \land X \to A \in F \land K \to X \in F
  • XX isn't a key
  • AA isn't prime

XRX \subset R means that XRX \neq R, otherwise X would be a superkey, as RRFA=F+R \to R \in F^A = F^+

Sapienza università di Roma, Dipartimento di Informatica

3NF pt.3

PDF 10 slide 5

Given a schema RR and a set of functional dependencies FF on RR, TFAE

  • RR is in 3NF
  • there are no attributes that partially or transitively depend on a key
  • XAF+:AX\forall X \to A \in F^+ : A \notin X either:
    • AA is prime (belongs to a key)
    • XX is superkey

Proof

TODO: I have it, I just have to write it out in LaTeX\LaTeX

Sapienza università di Roma, Dipartimento di Informatica

BCNF (Boyce-Codd)

PDF 10 slide 20

A relation schema RR is in Boyce-Codd Normal Form (BCNF) when every determinant in FF is a superkey. A relation that respects Boyce-Codd Normal Form is also in 3NF, but the opposite is not true.

Sapienza università di Roma, Dipartimento di Informatica

(X)F+(X)^+_F

PDF 11 slie 5

def clousure(R, F, X):
	Z = X
	S = { A ∈ R | Y → V ∈ F ∧ Y ⊆ Z ∧ A ∈ V }

	if S ⊆ Z:
		return Z

	return closure(R, F, Z ∪ S)
Sapienza università di Roma, Dipartimento di Informatica

(X)F+(X)^+_F

PDF 11 slide 8

The algorithm closure() correctly computes the closure of a set of attributes XX respectively to a set FF of functional dependencies on RR

Proof

Let's consider Zi,SiZ_i, S_i the values of ZZ and SS at the ii-th call of the function and Zf,SfSfZfZ_f, S_f \mid S_f \subseteq Z_f the values of Z,SZ, S at the last call of the function. Let's prove by induction that Zi(X)F+Z_i \subseteq (X)^+_F

Sapienza università di Roma, Dipartimento di Informatica

Zi(X)F+Z_i \subseteq (X)^+_F

Base case

Z0=X(X)F+Z_0 = X \subseteq (X)^+_F

Inductive step Zi(X)F+    Zi+1(X)F+Z_i \subseteq (X)^+_F \implies Z_{i + 1} \subseteq (X)^+_F

Given that Zi+1=ZiSiZ_{i + 1} = Z_i \cup S_i then if AZi+1A \in Z_{i + 1} either

  • AZi    A \in Z_i \implies by HP A(X)F+A \in (X)^+_F
  • ASi    A \in S_i \implies by construction YVFYZiAV    \exists Y \to V \in F \mid Y \subseteq Z_i \land A \in V \implies by HP Y(X)F+    Y \subseteq (X)^+_F \implies by the lemma of closure XYFAX \to Y \in F^A and by decomposition YAFA    Y \to A \in F^A \implies by transitivity XAFA    X \to A \in F^A \implies by definition A(X)F+A \in (X)^+_F
Sapienza università di Roma, Dipartimento di Informatica

(X)F+Zf(X)^+_F \subseteq Z_f (legal instance)

Given ZfZ_f we can build an instance r={t1,t2}r = \set{t_1, t_2} on RR such that

rr

ZfZ_f

RZfR \setminus Z_f

t1t_1

1 1 1 ... 1 1 1 1 ... 1

t2t_2

1 1 1 ... 1 0 0 0 ... 0

Let's verify that rr is a legal instance. Given VWFV \to W \in F as V,WV, W \neq \varnothing we could have either

  • VZf    AV:ARZf    t1[V]t2[V]    rV \nsubseteq Z_f \implies \exists A \in V : A \in R \setminus Z_f \implies t_1[V] \neq t_2[V] \implies r satisfies VWV \to W
  • VZfV \subseteq Z_f
    • WZf    W \subseteq Z_f \implies by construction t1[V]=t2[V]t1[W]=t1[W]    t_1[V] = t_2[V] \land t_1[W] = t_1[W] \implies rr satisfies VWV \to W
    • WZf    W \nsubseteq Z_f \implies by construction t1[V]=t2[V]t1[W]t2[W]t_1[V] = t_2[V] \land t_1[W] \neq t_2[W]
Sapienza università di Roma, Dipartimento di Informatica

(X)F+Zf(X)^+_F \subseteq Z_f (legal instance)

Let's suppose that VWF:r\exists V \to W \in F : r doesn't satisfy VW    V \to W \implies by construction

VZfAW:ARZf    AZfV \subseteq Z_f \land \exist A \in W : A \in R \setminus Z_f \implies A \notin Z_f

Given that VZfVWFAW    V \subseteq Z_f \land V \to W \in F \land A \in W \implies by construction of Sf,AZfS_f, \: A \in Z_f which is a contraddiction

Legality

In the first 2 cases rr satisfies VWFV \to W \in F case 3 can't happen     r\implies r is a legal instance of RR

Sapienza università di Roma, Dipartimento di Informatica

(X)F+Zf(X)^+_F \subseteq Z_f

Let's consider A(X)F+A \in (X)^+_F

Given that XAFA=F+X \to A \in F^A = F^+ and rr is a legal instance     r\implies r satisfies XYX \to Y, and given that by construction XZf    t1[X]=t2[X]    X \subseteq Z_f \implies t_1[X] = t_2[X] \implies by definition t1[A]=t2[A]    t_1[A] = t_2[A] \implies by construction AZfA \in Z_f

Zf=(X)F+Z_f = (X)^+_F

Given that Zi(X)F+  iNZ_i \subseteq (X)^+_F \; \forall i \in \mathbb{N} and (X)F+Zf(X)^+_F \subseteq Z_f, we have that Zf=(X)F+Z_f = (X)^+_F

Sapienza università di Roma, Dipartimento di Informatica

Intersection Rule

PDF 12 slide 19

Given a relation scheme RR and a set of functional dependencies FF on RR

X:=VWFR(WV)X := \bigcap\limits_{V \to W \in F} R-(W-V)

If XRF+X \to R \in F^+ then the intersection is the only key to RR otherwise there are multiple keys, and ALL of them must be identified to check if RR is in 3NF

Sapienza università di Roma, Dipartimento di Informatica

Decomposition

PDF 13 slide 8

Let RR be a relation scheme, a decomposition ρ\rho of RR is such that

ρ={R1,R2,...,Rk}P(R):i=1kRi=R\rho = \set{R_1, R_2, ..., R_k} \subseteq \mathcal{P}(R) : \bigcup\limits_{i = 1}^{k} R_i = R

Sapienza università di Roma, Dipartimento di Informatica

Equivalence

PDF 13 slide 12

Let FF and GG be two sets of functional dependencies, we can define an equivalence relation

FG    F+=G+F \equiv G \iff F^+ = G^+

  • reflexivity F    F+=F+    FFF \implies F^+ = F^+ \implies F \equiv F
  • simmetry FG    F+=G+    G+=F+    GFF \equiv G \implies F^+ = G^+ \implies G^+ = F^+ \implies G \equiv F
  • transitivity FGGH    F+=G+G+=H+    F+=H+    FHF \equiv G \land G \equiv H \implies F^+ = G^+ \land G^+ = H^+ \implies F^+ = H^+ \implies F \equiv H

PDF 13 slide 14

Let FF and GG be two sets of functional dependencies

FG+    F+G+F \subseteq G^+ \implies F^+ \subseteq G^+

Sapienza università di Roma, Dipartimento di Informatica

FG+    F+G+F \subseteq G^+ \implies F^+ \subseteq G^+

Base case

F0=FG+    F0G+F_0 = F \subseteq G^+ \implies F_0 \subseteq G^+

Inductive Step

FiG+    Fi+1G+F_i \subseteq G^+ \implies F_{i + 1} \subseteq G^+

XYFi+1    XYX \to Y \in F_{i + 1} \implies X \to Y has been optained through

  • reflexivity YX    Y \subseteq X \implies given that XYX \to Y is satisfied by every instance XYG+X \to Y \in G^+
  • augmentation ZR,VWFiX=ZV,Y=ZW\exists Z \subseteq R, V \to W \in F_i \mid X = ZV, Y = ZW
  • transitivity

TODO

Sapienza università di Roma, Dipartimento di Informatica

Preserving F

PDF 13 slide 15

Let RR be a relation scheme, FF a set of functional dependencies on RR and ρ={R1,R2,...,Rk}\rho = \set{R_1, R_2, ..., R_k} a decomposition of RR, we say that ρ\rho preseves FF if

FG=i=1kπRi(F)F \equiv G = \bigcup\limits_{i = 1}^{k} \pi_{R_i}(F)

Where

πRi(F)={XYF+XYRi}\pi_{R_i}(F) = \set{X \to Y \in F^+ \mid XY \subseteq R_i}

PDF 13 slide 16

Given the definition of GG, it will always be that GF+    G+F+G \subseteq F^+ \implies G^+ \subseteq F^+ so it is enough to verify that FG+F \subseteq G^+

Sapienza università di Roma, Dipartimento di Informatica

Dependency preservation

PDF 13 slide 17

def preserves_dependencies(R, F, ρ):
	for X → Y ∈ F:
		if Y ⊈ closure_G(R, F, ρ, X):
			return false

	return true

This algorithm is enough as we just have to check wether FG+F \subseteq G^+

Given XYFX \to Y \in F we have that XYG+=GA    X \to Y \in G^+ = G^A \iff by the lemma of closure Y(X)G+Y \subseteq (X)^+_G

Sapienza università di Roma, Dipartimento di Informatica

(X)G+(X)^+_G

def clousure_G(R, F, X, ρ):
	Z = X
	S = ∅

	for P ∈ ρ:
		S = S ∪ (closure(R, F, Z ∩ P) ∩ P)

	if S ⊆ Z
		return Z

	return closure_G(R, F, Z ∪ S)

PDF 13 slide 23 Let RR be a relation schema, FF a set of functional dependencies on RR and
ρ={R1,R2,...,Rk}\rho = \set{R_1, R_2, ..., R_k} a decomposition of RR and XRX \subseteq R the algorithm closure_G() correctly computes (X)G+(X)^+_G

Sapienza università di Roma, Dipartimento di Informatica

Zf(X)G+Z_f \subseteq (X)^+_G

Let Zi,SiZ_i, S_i the values of ZZ and SS at the ii-th call of the function, with Z0=XZ_0 = X, and SfZfS_f \subseteq Z_f

Base case

Z0=X(X)G+    Z0(X)G+Z_0 = X \subseteq (X)^+_G \implies Z_0 \subseteq (X)^+_G by HP

Inductive step

Zi(X)G+    Zi+1(X)G+Z_i \subseteq (X)^+_G \implies Z_{i+1} \subseteq (X)^+_G, given that Si=j=1k(ZiRj)F+RjS_i = \bigcup\limits_{j = 1}^k (Z_i \cap R_j)^+_F \cap R_j

Let AZi+1=ZiSi    j:A(ZiRj)Rj    ZiRjAGAA \in Z_{i + 1} = Z_i \cup S_i \implies \exists j : A \in (Z_i \cap R_j) \cap R_j \implies Z_i \cap R_j \to A \in G^A

By HP we have that Zi(X)G+    XZiGAZ_i \subseteq (X)^+_G \implies X \to Z_i \in G^A, let Zi=(ZiRj)VZ_i = (Z_i \cap R_j) \cup V by decomposition we have that XZiRjGA    X \to Z_i \cap R_j \in G^A \implies by transitivity XAGAX \to A \in G^A

Sapienza università di Roma, Dipartimento di Informatica

XY    (X)F+(Y)F+X \subseteq Y \implies (X)^+_F \subseteq (Y)^+_F

XY    YXFAX \subseteq Y \implies Y \to X \in F^A by reflexivity

Given A(X)F+    A \in (X)^+_F \implies by the lemma of closure XAFA    X \to A \in F^A \implies by transitivity YAFA    Y \to A \in F^A \\ \implies by the lemma of closure A(Y)F+A \in (Y)^+_F

(X)G+Zf(X)^+_G \subseteq Z_f

XZf    (X)G+(Zf)G+X \subseteq Z_f \implies (X)^+_G \subseteq (Z_f)^+_G, we have to prove that Zf=(Zf)G+Z_f = (Z_f)^+_G

Let's consider AS={ARVWGVZfAW}    VWG:VZfAW    Rjρ:VWRj    VZfRjARj    A(ZfRj)F+Rj    ASf    AZfA \in S' = \set{A \in R \mid V \to W \in G \land V \subseteq Z_f \land A \in W} \implies \exists V \to W \in G : V \subseteq Z_f \land A \in W \implies \exists R_j \in \rho : VW \subseteq R_j \implies V \subseteq Z_f \cap R_j \land A \in R_j \implies A \in (Z_f \cap R_j)^+_F \cap R_j \implies A \in S_f \implies A \in Z_f

Sapienza università di Roma, Dipartimento di Informatica

Loseless join

PDF 15 slide 11 Let RR be a relation schema. A decomposition ρ={R1,R2,...,Rk}\rho = \set{R_1, R_2, ..., R_k} of RR has a lossless join if r\forall r legal instance of RR we have that r=πR1(r)πR2(r)...πRk(r)r = \pi_{R_1}(r) \bowtie \pi_{R_2}(r) \bowtie ... \bowtie \pi_{R_k}(r)

PDF 15 slide 13 Let RR be a relation schema and let ρ={R1,R2,...,Rk}\rho = \set{R_1, R_2, ..., R_k} be a decomposition of RR; for each legal instance rr of RR, we denote mρ(r)=πR1(r)πR2(r)...πRk(r)m_{\rho}(r) = \pi_{R_1}(r) \bowtie \pi_{R_2}(r) \bowtie ... \bowtie \pi_{R_k}(r)

  • rmρ(r)r \subseteq m_{\rho}(r)
  • πRi(mρ(r))=πRi(r)\pi_{R_i}(m_{\rho}(r)) = \pi_{R_i}(r)
  • mρ(mρ(r))=mρ(r)m_{\rho}(m_{\rho}(r)) = m_{\rho}(r)

Given S1,...,SkS_1, ..., S_k relation schemas with their instances s1,...,sks_1, ..., s_k, let's define the \bowtie operator as

i=1kSi={i=1ktjsi  tjsi  i=1ktj is a function}\mathop{\bowtie}\limits_{i = 1}^k S_i = \set {\bigcup\limits_{i = 1}^k t_j \mid \forall s_i \; \forall t_j \in s_i \; \land \bigcup\limits_{i = 1}^k t_j \text{ is a function}}

Sapienza università di Roma, Dipartimento di Informatica

rmρ(r)r \subseteq m_{\rho}(r)

tr    t[Ri]πRi(r)  Riρt \in r \implies t[R_i] \in \pi_{R_i}(r) \; \forall R_i \in \rho by definition

i=1kπRi(r)={i=1kpi[Ri]pi[Ri]πRi(r)i=1kpi[Ri] is a function}\mathop{\bowtie}\limits_{i = 1}^k \pi_{R_i}(r) = \set {\bigcup\limits_{i = 1}^k p_i[R_i] \mid p_i[R_i] \in \pi_{R_i}(r) \land \bigcup\limits_{i = 1}^k p_i[R_i] \text{ is a function}}

tr,  t=i=1kt[Ri]\forall t \in r, \; t = \bigcup\limits_{i = 1}^k t[R_i] as by definition of ρ\rho we have that R=i=1kRiR = \bigcup\limits_{i = 1}^k R_i

tr    tt \in r \implies t is a function by definition

t=i=1kt[Ri]i=1kπRi(r)=mρ(r)    tmρ(r)t = \bigcup\limits_{i = 1}^k t[R_i] \in \mathop{\bowtie}\limits_{i = 1}^k \pi_{R_i}(r) = m_{\rho}(r) \implies t \in m_{\rho}(r)

Sapienza università di Roma, Dipartimento di Informatica

πRi(mρ(r))=πRi(r)\pi_{R_i}(m_{\rho}(r)) = \pi_{R_i}(r)

tr    t \in r \implies by definition t[Ri]πRi(r)  Riρt[R_i] \in \pi_{R_i}(r) \; \forall R_i \in \rho
πRi(mρ(r))={q[Ri]qi=1kπRi(r)}\pi_{R_i}(m_{\rho}(r)) = \set{q[R_i] \mid q \in \mathop{\bowtie}\limits_{i = 1}^k \pi_{R_i}(r)}

πRi(r)πRi(mρ(r))\pi_{R_i}(r) \subseteq \pi_{R_i}(m_{\rho}(r))

tr    tmρ(r)    t[Ri]πRi(mρ(r))t \in r \implies t \in m_{\rho}(r) \implies t[R_i] \in \pi_{R_i}(m_{\rho}(r))

πRi(mρ(r))πRi(r)\pi_{R_i}(m_{\rho}(r)) \subseteq \pi_{R_i}(r)

qi=1kπRi(r)    q \in \mathop{\bowtie}\limits_{i = 1}^k \pi_{R_i}(r) \implies by definition of join q=i=1k{pi[Ri]}pir    q = \mathop{\bowtie}\limits_{i = 1}^k \set{ p_i[R_i] } \mid p_i \in r \implies given that qq is a function q[Ri]=pi[Ri]q[R_i] = p_i[R_i] and pir    pi[Ri]πRi(r)p_i \in r \implies p_i[R_i] \in \pi_{R_i}(r) we have that q[Ri]πRi(r)q[R_i] \in \pi_{R_i}(r)

Sapienza università di Roma, Dipartimento di Informatica

mρ(mρ(r))=mρ(r)m_{\rho}(m_{\rho}(r)) = m_{\rho}(r)

mρ(mρ(r))=i=1kπRi(mρ(r))=i=1kπRi(r)=mρ(r)m_{\rho}(m_{\rho}(r)) = \mathop{\bowtie}\limits_{i = 1}^k \pi_{R_i}(m_{\rho}(r)) = \mathop{\bowtie}\limits_{i = 1}^k \pi_{R_i}(r) = m_{\rho}(r)

Sapienza università di Roma, Dipartimento di Informatica

Loseless join pt.2

PDF 15 slide 15 Given ρ={R1,R2,...,Rk}\rho = \set{R_1, R_2, ..., R_k}, build a table rr with R|R| columns and kk rows. At the ii-th row and jj-th column put aja_j if ARiA \in R_{i} else bijb_{ij}

def has_looseless_join(R, F, ρ):
	while !(∃ t ∈ r | ∀ A ∈ R, t[A] = a) and r changed:
		for X → Y ∈ F:
			for t1 ∈ r:
				for t2 ∈ r:
					if t1[X] = t2[X] and t1[Y] != t2[Y]:
						for A ∈ Y:
							if t1[A] = a:
								t2[A] = t1[A]
							else:
								t1[A] = t2[A]

	return ∃ t ∈ r | ∀ A ∈ R, t[A] = a
Sapienza università di Roma, Dipartimento di Informatica

Correctness

PDF 15 slide 19

Let RR be a relation scheme, FF a set of functional dependencies on RR and let ρ={R1,R2,...,Rk}\rho = \set{R_1, R_2, ..., R_k} be a decomposition of RR; the algorithm correctly decides whether ρ\rho has a lossless join

r=mρ(r)    rr = m_{\rho}(r) \iff r has a tuple with all aa when the algorithm termintes

TODO: I can prove r=mρ(r)    rr = m_{\rho}(r) \implies r has a tuple with all aa when the algorithm terminates, I just have to write it in LaTeX\LaTeX

Sapienza università di Roma, Dipartimento di Informatica

Minimal cover

PDF 17 slide 7

Let RR be a schema and FF be a set of functional dependencies on RR. A minimal cover of FF is a set of functional dependencies GFG \equiv F such that:

  • XYG,Y=1\forall X \to Y \in G, |Y| = 1
  • XAG,XXG(G{XA}){XA}\forall X \to A \in G, \nexists X' \subset X \mid G \equiv (G - \set{X \to A}) \cup \set{X' \to A}
  • XAGGG{XA}\nexists X \to A \in G \mid G \equiv G - \set{X \to A}
Sapienza università di Roma, Dipartimento di Informatica

Minimal cover (step 1)

F1={XAXYFAY}F_1 = \set{X \to A \mid X \to Y \in F \land A \in Y}

FAF1F \overset{A}{\to} F_1 by decomposition F1AF1A    FF1AF_1 \overset{A}{\to} F_1^A \implies F \subseteq F_1^A

F1AFF_1 \overset{A}{\to} F by union FAFA    F1FAF \overset{A}{\to} F^A \implies F_1 \subseteq F^A

FF1F \equiv F_1

Sapienza università di Roma, Dipartimento di Informatica

Minimal cover (step 2)

Given XAF1,XXXAF1+    F2=(F1{XA}){XA}X \to A \in F_1, X' \subset X \land X' \to A \in F_1^+ \implies F_2 = (F_1 \setminus \set{X \to A}) \cup \set{X' \to A}

XX    XXF1+XXF2+X' \subseteq X \implies X \to X' \in F_1^+ \land X \to X' \in F_2^+ by reflexivity

XAF1X \to A \in F_1

  • XAF2    XAF2+X \to A \in F_2 \implies X \to A \in F_2^+
  • XAF2    XXF2+XAF2+    XAF2+X \to A \notin F_2 \implies X \to X' \in F_2^+ \land X' \to A \in F_2^+ \implies X \to A \in F_2^+ by transitivity

XAF2X \to A \in F_2

  • XAF1    XAF1+X \to A \in F_1 \implies X \to A \in F_1^+
  • XAF1    XAF1+X \to A \notin F_1 \implies X \to A \in F_1^+ by HP

F2F1    FF2F_2 \equiv F_1 \implies F \equiv F_2 by transitivity of the \equiv relationship

Sapienza università di Roma, Dipartimento di Informatica

Minimal cover (step 3)

XAF2,  A(X)F2{XA}+    F3=F2{XA}X \to A \in F_2, \; A \in (X)^+_{F_2 \setminus \set{X \to A}} \implies F_3 = F_2 \setminus \set{X \to A}

XAF2X \to A \in F_2

  • XAF3    XAF3+X \to A \in F_3 \implies X \to A \in F_3^+
  • XAF3    XAF3+X \to A \notin F_3 \implies X \to A \in F_3^+ by HP as A(X)F3+A \in (X)^+_{F_3}

XAF3X \to A \in F_3

  • XAF2    XAF2+X \to A \in F_2 \implies X \to A \in F_2^+
  • XAF2X \to A \notin F_2 is a contraddiction as F3=F2{XA}F_3 = F_2 \setminus \set{X \to A} by definition

F2F3    FF3F_2 \equiv F_3 \implies F \equiv F_3

Sapienza università di Roma, Dipartimento di Informatica

Decomposition

def decomposition(R, F: minimal cover):
	S = ∅
	ρ = ∅

	for A ∈ R | ∄ X → Y ∈ F : A ∈ XY:
		S = S ∪ {A}

	if S != ∅:
		R = R - S
		ρ = ρ ∪ {S}

	if ∃ X → Y ∈ F | XY = R:
		ρ = ρ ∪ {R}
	else:
		for X → A ∈ F:
			ρ = ρ ∪ {XA}
Sapienza università di Roma, Dipartimento di Informatica

Decomposition pt.2

PDF 19 slide 5

Let RR be a relational schema and FF a set of functional dependencies on RR, which is a minimal cover; the algorithm decomposition() computes (in polynomial time) a decomposition ρ\rho of RR such that:

  • each relational schema in ρ\rho is in 3NF
  • ρ\rho preserves FF
Sapienza università di Roma, Dipartimento di Informatica

https://www.compart.com/en/unicode/U+2288

TODO: relational algebra

TODO: define domain

TODO: define dom : names -> Domains

TODO: define attribute

$$ t : R \to \bigcup\limits_{i = 1}^n dom(A_i) : A_i \mapsto a \in dom(A_i) $$

If $X \subseteq R$ and $t_1$ and $t_2$ are two tuples on $R$, $t_1$ and $t_2$ coincide on $X$ $(t_1[X] = t_2[X])$ if $\forall A \in X t_1[A] = t_2[A]$

If $X \subseteq R$ and $t_1$ and $t_2$ are two tuples on $R$, if $\forall A \in X, t_1[A] = t_2[A] \implies t_1[X] = t_2[X]$

Given a relation schema $R$, a functional dependency on $R$ is an ordered pair of non-empty subsets X, Y ⊆ R

Given a relation schema $R$ and $X, Y \subseteq R : X \neq \varnothing \land Y \neq \varnothing \implies (X, Y) = X \to Y$ is a functional dependency on $R$

> NOTE: the definition of functional dependency doesn't depend on the concept of instance, as a functional dependency is just an ordered pair of non-empty subsets of $R$; it's important to understand the concept that $r$ may or may not satisfy $X \to Y$

it satisfies all dependencies in $F$

# How big is $F^+$? Given a relation schema $R$ such that $|R| = n$, let's consider the definition of functional dependency: we need $X, Y \in \mathcal{P}(R) \setminus \varnothing$, given that $|\mathcal{P}(R)| = 2^n$ we could have $2^n - 1$ choices for $X$ and $Y$. Given that we could pair the subsets however we want, we have $(2^n - 1)(2^n - 1) = 2^{2n} - 2\cdot 2^n + 1 = \\ =2^{2n} - 2^{n + 1} + 1$ combinations Let's consider the smallest possible $F = \varnothing$, how many values does $F^+$ have? ---

By definition an instance $r$ of $R$ satisfies a dependency $X \to Y$ if $\forall t_1, t_2 \in r, t_1[X] = t_2[X] \implies t_1[Y] = t_2[Y]$

$X \to Y, X \to Z \in F^A \implies X \to YZ \in F^A$

TODO check if subseteq? NOT needed because X is a proper subset of K

PDF 10 slide 6

Let $R$ be a relation schema and $F$ a set of functional dependencies on $R$. A schema $R$ is in 3NF if and only if neither partial dependencies nor transitive dependencies exist in $F$

> TODO: example that show that not all relation schemas can be reduced to 3NF

$Z_0 = X \subseteq (X)^+_F \implies Z_0 \subseteq (X)^+_F$

### Inductive step

We have to prove that $Z_i \subseteq (X)^+_F \implies Z_{i + 1} \subseteq (X)^+_F$

and $S_i = \set{A \in R \mid Y \to V \in F \land Y \subseteq Z_i \land A \in V}$

If the intersection of these sets determines $R$, then the intersection is the only key to $R$ else there are multiple keys, and ALL of them must be identified for checking 3NF

S = { A ∈ R | Y → V ∈ F ∧ Y ⊆ Z ∧ A ∈ V }

# return X if S ⊆ X else closure(R, F, X ∪ S)

PDF 13 slide 19

return X if S ⊆ X else closure_G(R, F, X ∪ S)

the final values of $Z$ and $S$

# $X \subseteq Y \implies (X)^+_F \subseteq (Y)^+_F$ _(alternative)_ Let $X_i, Y_i$ the values of the clousure after the $i$-th application of the algorithm `closure()` ### Base case $X_0 = X \subseteq Y = Y_0$ ### Inductive step $X_i \subseteq Y_i \implies X_{i + 1} \subseteq Y_{i + 1}$ $A \in X_{i + 1}$, given that $X_{i + 1} = X_i \cup S^X_i \land S^X_i = \set{A \mid V \to W \in F \land V \subseteq X_i \land A \in W} \implies$ by HP $V \subseteq Y_i \implies W \subseteq S^Y_i \implies$ by decomposition, and given that $Y_{i + 1} = Y_i \cup S^Y_i \implies A \in Y_{i + 1}$ ---

If we decompose a relation schema $R$ we want the obtained decomposition $\rho = \set{R_1, R_2, ..., R_k}$ such that each legal instance $r$ of $R$ can be reconstructed through natural join from legal instnaces $r_1, r_2, ..., r_k$ ρ = R1 , R2 , ..., Rk , such that each legal instance r of R can be reconstructed through natural join from the legal instances r1 , r2 , ..., rk of the decomposition schemas R1 , R2 , ..., Rk as for reconstructing a tuple t of r it is required that t[Ri ] ∈ ri , ∀ i = 1,...,k, then we must have πRi(r) = ri , ∀ i = 1,...,k

$t \in r \implies t[R_i] = \set{(A, t[A]) \mid A \in R_i}$

Let's define the $\bowtie$ operator $R \bowtie S = \set{t_1 \cup t_2 \mid t_1 \in r \land t_2 \in s \land t_1 \cup t_2 \text{ is a function}}$

$\forall t \in r, \; t = \bigcup\limits_{i = 1}^k t[R_i] = \mathop{\bowtie}\limits_{i = 1}^k \set{ t[R_i] }$

$t \in r \implies$ by definition $t[R_i] \in \pi_{R_i}(r) \; \forall R_i \in \rho \implies \set{ t[R_i] } \subseteq \pi_{R_i}(r) \; \forall R_i \in \rho$

$t = \mathop{\bowtie}\limits_{i = 1}^k \set{ t[R_i] } \subseteq \mathop{\bowtie}\limits_{i = 1}^k \pi_{R_i}(r) = m_{\rho}(r) \implies t \in m_{\rho}(r)$

$\forall t \in r$ we consider $t[R_i], \: R_i \in \rho$, we have that $t \in \set{ t[R_1] } \bowtie ... \bowtie \set{ t[R_k] } \subseteq \pi_{R_1}(r) \bowtie ... \bowtie \pi_{R_k}(r) = m_{\rho}(r)$

Let's consider $t_{R_i} \in \pi_{R_i}(m_{\rho}(r))$, we have to prove that $t_{R_i} \in \pi_{R_i}(r)$

$\exists p_1, p_2, ..., p_k \in r \mid p_i[R_i] \in \pi_{R_i}(r) \; \forall i = 1,..., k$

$t_{R_i} \in \pi_{R_i}(m_{\rho}(r)) \iff \exists t' \in m_{\rho}(r) : t'[R_i] = t_{R_i} \iff$ $\iff \exists t_1, ..., t_k \in r : t'[R_j] = t_j[R_j] \quad \forall R_j \in \rho$ but $t_{R_i} = t[R_i] \in \pi_{R_i}(r)$

$m_{\rho}(m_{\rho}(r)) = \pi_{R_1}(m_{\rho}(r)) \bowtie ... \bowtie \pi_{R_k}(m_{\rho}(r)) = \pi_{R_1}(r) \bowtie ... \bowtie \pi_{R_k}(r) = m_{\rho}(r)$

$m_{\rho}(m_{\rho}(r)) = \pi_{R_1}(m_{\rho}(r)) \bowtie ... \bowtie \pi_{R_k}(m_{\rho}(r)) = \pi_{R_1}(r) \bowtie ... \bowtie \pi_{R_k}(r) = m_{\rho}(r)$

Let's consider $t_{R_i} \in \pi_{R_i}(r)$, we have to prove that $t_{R_i} \in \pi_{R_i}(m_{\rho}(r))$

Let's consider $t_{R_i} \in \pi_{R_i}(r) \land t' \in r$ with $t'[R_i]$, we have that $t'[R_i] \in t_{R_i}$

TODO: check if minimal cover is equivalent to F

[PDF 19 slide 4](19%20-%20decomposition%20algorithm.pdf#page=4)

--- # Physical organization - Heap - Sequential - Random (hashing) - Indexed Sequential - List Data - Secondary Indexes _(and inverted files)_ - $B$-Tree - $B^+$-tree --- # Hark Disk ## Reading cost **service time** = seek time + rotational delay + service time response time = queueing time + **service time** A physical organization can optimize record lookup, as we can read from the disk either **randomly** or **sequentially** _(which is faster, as we don't have to account for seek time)_ --- # Heap New records are inserted at the end of the file Linear search is the only option for records retrieval The `linear_search()` function has $O(\frac{\#Blk}{2})$ average complexity The `insert()` function has $O(1)$ complexity --- # Sequential Records are sorted based on key The `binary_search()` function has $O(log_2(\#Blk))$ complexity with random access The `linear_search()` function has $O(\frac{\#Blk}{2})$ complexity with sequential access The `insert()` function has $O(n \cdot log_2(n))$ complexity > TODO: Slide 20 --- # Hashing `hash(key) -> address` function, it uses buckets. Collitions can occur. `%` is kinda cool, better if $n$ is prime **Loading Factor**: average number of records per bucket divided by bucket size - tradeoff between efficient use of storage capacity and retrieval performance # ISAM (Indexed Sequential Access Method) It's basically sequential file organization with an index before > TODO research on loading factor Slide 40 --- # Concorrenza [PDF 22 slide 5](22%20-%20Il%20controllo%20della%20concorrenza.pdf#page=5) **Transazione** - esecuzione di una parte di un programma che rappresenta un’unità logica di accesso o modifica del contenuto della base di dati [PDF 22 slide 6](22%20-%20Il%20controllo%20della%20concorrenza.pdf#page=6) **ACID** - Atomicity, Consistency, Isolation e Durability [PDF 22 slide 7](22%20-%20Il%20controllo%20della%20concorrenza.pdf#page=7) **Schedule** - (piano di esecuzione) di un insieme $T$ di transazioni ordinamento delle operazioni nelle transazioni in $T$ che conserva l’ordine che le operazioni hanno all’interno delle singole transazioni [PDF 22 slide 8](22%20-%20Il%20controllo%20della%20concorrenza.pdf#page=8) **Schedule seriale** - schedule ottenuto permutando le transazioni in T --- # Schedule [PDF 22 slide 15](22%20-%20Il%20controllo%20della%20concorrenza.pdf#page=15) Tutti gli schedule seriali sono corretti. Uno schedule non seriale è corretto se è serializzabile, cioè se è “equivalente” ad uno schedule seriale. **Possibili problemi** - aggiornamento perso - dato sporco - aggregato non corretto [PDF 22 slide 19](22%20-%20Il%20controllo%20della%20concorrenza.pdf#page=19) Due schedule sono **equivalenti** se (per ogni dato modificato) producono valori uguali, dove due valori sono uguali solo se sono prodotti dalla stessa sequenza di operazioni --- # Item [PDF 22 slide 24](22%20-%20Il%20controllo%20della%20concorrenza.pdf#page=24) **Item** - unità a cui l’accesso è controllato Le **dimensioni** degli item devono essere definite in base all’uso che viene fatto della base di dati in modo tale che in media una transazione acceda a pochi item _(granularità)_ # Lock [PDF 23 slide 2](23%20-%20Il%20meccanismo%20di%20lock%20–%20Lock%20binario.pdf#page=2) **Lock** - privilegio di accesso ad un singolo item realizzato mediante una variabile associata all’item (variabile lucchetto) il cui valore descrive lo stato dell’item rispetto alle operazioni che possono essere effettuate su di esso --- # Lock pt.2 [PDF 23 slide 6](23%20-%20Il%20meccanismo%20di%20lock%20–%20Lock%20binario.pdf#page=6) Uno schedule è detto **legale** se una transazione effettua un locking ogni volta che deve leggere o scrivere un item ciascuna transazione rilascia ogni lock che ha ottenuto [PDF 23 slide 31](23%20-%20Il%20meccanismo%20di%20lock%20–%20Lock%20binario.pdf#page=31) Uno schedule è serializzabile se esiste uno schedule seriale tale che per ogni item l’ordine in cui le varie transazioni fanno un lock su quell’item coincide con quello dello schedule seriale [PDF 23 slide 32](23%20-%20Il%20meccanismo%20di%20lock%20–%20Lock%20binario.pdf#page=32) Algoritmo 1 Dato uno schedule S • Passo 1 • crea un grafo diretto G (grafo di serializzazione) nodi: transazioni archi: Ti Tj (con etichetta X) se in S Ti esegue un unlock(X) e Tj esegue il successivo lock(X) non UN successivo ma IL successivo, cioè Tj è la prima transazione che effettua il lock di X dopo che Ti ha effettuato l’unlock, anche se le due operazioni non sono di seguito --- # Serializzabilità [PDF 23 slide 38](23%20-%20Il%20meccanismo%20di%20lock%20–%20Lock%20binario.pdf#page=38) Uno schedule $S$ è serializzabile $\iff$ il suo grafo di serializzazione è aciclico [PDF 23 slide 42](23%20-%20Il%20meccanismo%20di%20lock%20–%20Lock%20binario.pdf#page=42) Una transazione obbedisce al protocollo di locking a due fasi, o più semplicemente è a due fasi, se - prima effettua tutte le operazioni di lock (fase di locking) e - poi tutte le operazioni di unlock (fase di unlocking) [PDF 23 slide 43](23%20-%20Il%20meccanismo%20di%20lock%20–%20Lock%20binario.pdf#page=43) Sia $T$ un insieme di transazioni. Se ogni transazione in $T$ è a due fasi $\implies$ ogni schedule di $T$ è serializzabile --- # Lock a 3 valori [PDF 24 slide 7](24%20-%20Lock%20a%20tre%20valori.pdf#page=7) Due schedule sono equivalenti se - producono lo stesso valore per ogni item su cui viene effettuato un wlock (le formule che danno i valori finali per ciascun item sono le stesse) - ogni operazione rlock(X) legge lo stesso valore di X nei due schedule Testare la serializzabilità Algoritmo Dato uno schedule S • Passo 1 • crea un grafo diretto G (grafo di serializzazione) nodi: transazioni archi: Ti Tj (con etichetta X) se in S - Ti esegue una rlock(X) o una wlock(X) e Tj è la transazione che esegue la successiva wlock(X) - Ti esegue una wlock(X) e Tj esegue una rlock(X) dopo che Ti ha eseguito la wlock(X) e prima che un’altra transazione esegua una wlock(X). O Tj esegue la successiva wlock (dopo una rlock o una wlock, è indifferente) oppure esegue la rlock tra due wlock (potrebbero esserci più rlock tra due wlock e quindi più archi che partono da Ti ) --- # Lock a 3 valori pt.2 Una transazione nel modello a tre valori è a due fasi se nessuna operazione di lock (rlock o wlock) segue una operazione di unlock - Se ogni transazione in un insieme T è a due fasi allora ogni schedule di T è serializzabile - Solo se tutte le transazioni sono a due fasi possiamo avere la certezza che ogni schedule è serializzabile [PDF 25 slide 16] Algoritmo 2 • Dato uno schedule S • Passo 1 • crea un poligrafo diretto P nodi: transazioni+T0+Tf archi • vengono creati gli archi in accordo al vincolo 1: se una transazione T2 legge il valore di un item X scritto da una transazione T1 viene aggiunto l’arco T1 T2 • vengono eliminati tutti gli archi entranti in transazioni inutili (una transazione inutile T può essere individuata facilmente perché non c’è nessun cammino in P da T a Tf ) --- # Deadlock [PDF 26 slide 2](26%20-%20Deadlock%20e%20livelock%20-%20Protocollo%20di%20locking%20a%20due%20fasi%20stretto.pdf#page=2) Un deadlock si verifica quando - ogni transazione in un insieme T è in attesa di ottenere un lock su un item sul quale qualche altra transazione nell’insieme T mantiene un lock e quindi - rimane bloccata e quindi - non rilascia i lock e quindi - può bloccare anche transazioni che non sono in T --- # Timestamp [PDF 27 slide 2](27%20-%20Time-stamp.pdf#page=2) Il timestamp identifica una transazione è assegnato alla transazione dallo scheduler quando la transazione ha inizio può essere - il valore di un contatore - l’ora di inizio della transazione [PDF 27 slide 4](27%20-%20Time-stamp.pdf#page=4) Uno schedule è serializzabile se è equivalente allo schedule seriale in cui le transazioni compaiono ordinate in base al loro timestamp --- # Timestamp pt.2 [PDF 27 slide 4](27%20-%20Time-stamp.pdf#page=5) Quindi uno schedule è serializzabile se: per ciascun item acceduto da più di una transazione, l’ordine con cui le transazioni accedono all’item è quello imposto dai timestamp $$ $$