Introduction to Databases

information: data with meaning

knowledge: actionable information

exploratory data analysis (EDA)

relation ≈ table

data definition language (DDL): for schema

data manipulation language (DML): for query and CRUD

DML & DDL both in SQL

data model (DM): entity, attribute, relationship, constraint

entity relationship diagram (E-R diagram) (ER diagram)

unified modeling language (UML): for OODM

relation schema

relation instance

  • element

attribute usually atomic

superkey : subset of row tuple, bijection with row

  • superset of is also superkey

candidate key = minimal superkey

union compatibility/ type compatibility

data control language (DCL) & transaction control language (TCL): management

like: any substring %, any char _

for where clause filter: like, between, some, all

for where clause standalone: exists, unique

union, intersect, except (all)

select x, fn(y) from _ group by x having …

boolean type: true, false, unknown

select _ from _ where x in y;
-- lateral clause:
select _ from x, lateral (/* access outside variables */) …;
-- with clause:
with alias_name as (…) select …;
delete from _ where _;
insert into table_name (attributes, …) values (…,), (…,), …
update table_name set attributes = value where …;
-- join using
select _ from relation1 join relation2 using (attr1, …);
-- view/ materialized view
create view view_name as select …;
create materialized view view_name as select …;

domain constraint on single relation: not null, unique, check (…)

referential integrity: foreign key (attr1) references table_name on delete …

named constraint: attr1 type, constraint constr_name check (…)

  • drop named constraint: alter table table_name drop constraint constr_name;
  • deferrable constraint: check at end of transaction

assertion

create assertion assert_name check (…);

function call

  • coalesce(attr1, default_value_in_place_of_null)
  • cast(attr1 as type)
  • data_format(value, 'format string')
    • if(predicate, value_when_true, value_when_false) or decode in Oracle

user-defined type (UDT): distinct type/ structured data type

create type type_name as …;
create domain domain_name as …;

create table extension

create table table2 like table1;
create table table1 as (select …) with data;

stored function/ stored procedure:

delimiter $$
create function fn_name(arg1 type1, …) returns type_out begin
    -- declare local variable, set to NULL by default
    declare local_var var_type default default_val;
    -- mutate local variable
    set local_var = …;
    select _ into local_var from …;
    return …;
end $$
-- `type_out` can be a `table (…)`—table function

create procedure proc_name(
    in arg_input type1, out arg_output type2, inout arg_mutate type3, …
) begin
    -- …
end $$
-- no returning for procedure

delimiter ;

-- call procedure
call proc_name(args…, @outside_var);

stored function: deterministic/ non-deterministic (default)

procedural SQL:

-- expression with pattern matching
case -- optionally with value here
    when _ then _
    …
    else _
end;

-- loop
label1: loop
    iterate; -- continue
    leave; -- break
end loop;
while predicate1 do
    -- …
end while;
repeat
    -- …
until predicate1 end repeat;
for each_row as table_value1 do
    -- …
end for;

trigger:

create trigger trigger_name after insert on table_name -- or `before`, `delete`
referencing new row as row_name for each row when ( -- or `old row`.
-- or without renaming `new` or `old`.
    -- …
)
begin -- compound statement
    rollback
end;

-- multiple trigger
create trigger trigger_name before update on table_name
for each row follows another_trigger_name begin
    -- …
end, $$

error handling:

declare continue handle for sqlstate 'err_no' begin -- or for `not found`
    -- …
end; -- or `set _ = …`

cursor:

declare cur_name for select _ from _;
open cur_name;
fetch cur_name into var1, …;
close cur_name;

entity-relationship model (E-R model): entity set, relationship set, attribute

  • multi-valued attribute, composite attribute, derived attribute
  • mapping cardinality, cardinality constraint (directed/undirected link)
  • total participation (double line)/ partial participation
  • weak entity set
    • identifying entity set primary key → discriminator attribute
    • descriptive attribute

lossy decomposition: (lossless: )

functional dependency (FD) : can uniquely identify by

  • trivial FD:
  • lossless decomposition (or swap 1&2)

Boyce-Codd Normal Form (BCNF): or superkey

mean time to failure MTTF

RAID: data striping vs mirror

fixed-length record: deletion: free list for deletion

variable-length record: offset + length, null bitmap

heap: free-space map (first/second-level)

sequential file: linked list of sorted search key

multi-table clustering file organization: faster some join & slower some other, cluster key

table partitioning

buffer manager: buffer replacement strategy (LRU/MRU), pin/unpin, shared/exclusive lock

column-oriented storage: less IO, better caching & compression

dense index: index for each search-key value

  • dense clustering index: group by cluster key

sparse index: index for some search-key value

deadlock prevention:

  • wait-die scheme, non-preemptive: older transaction wait for lock, younger transaction die
  • wound-die scheme, preemptive: older transaction wound younger transaction when waiting for lock
  • timeout-based scheme: hard to determine timeout, starvation
  • wait-for graph: cycle → deadlock

recovery system:

  • log-based recovery: redo/undo, immediate/deferred DB modification