Database System Architecture
▪ Test-And-Set(M)
• Memory location M, initially 0
• Test-and-set(M) sets M to 1, and returns old value of M
▪ Return value 0 indicates process has acquired the mutex
▪ Return value 1 indicates someone is already holding the mutex
• Must try again later
▪ Release of mutex done by setting M = 0
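A minimal sketch of the spinlock described above, assuming C++ is acceptable for illustration; std::atomic_flag supplies the atomic test-and-set primitive, and the class name SpinMutex is purely illustrative.

```cpp
#include <atomic>

// Memory word M, initially 0 (clear).
class SpinMutex {
    std::atomic_flag m = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // test_and_set() sets M to 1 and returns the old value;
        // old value 1 means someone already holds the mutex, so retry.
        while (m.test_and_set(std::memory_order_acquire)) {
            // busy-wait: must try again later
        }
    }
    void unlock() {
        m.clear(std::memory_order_release);   // release: set M back to 0
    }
};
```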
▪ Compare-and-swap(M, V1, V2)
• Atomically do following
▪ If M = V1, set M = V2 and return success
▪ Else return failure
• With M = 0 initially, CAS(M, 0, 1) equivalent to test-and-set(M)
• Can use CAS(M, 0, id) where id = thread-id or process-id to record
who has the mutex
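Correspondingly, a small sketch of a CAS-based mutex that records its owner, assuming the caller passes in its own thread id; the function names are illustrative only.

```cpp
#include <atomic>

std::atomic<long> M{0};   // 0 = mutex free, otherwise holder's thread id

bool try_acquire(long thread_id) {
    long expected = 0;
    // CAS(M, 0, thread_id): succeeds only if M is still 0,
    // and records who now holds the mutex.
    return M.compare_exchange_strong(expected, thread_id);
}

void release(long thread_id) {
    long expected = thread_id;
    // Only the recorded owner resets M back to 0.
    M.compare_exchange_strong(expected, 0L);
}
```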
▪ Prefetching
• Prefetch items that may be used soon
▪ Data caching
• Cache coherence
▪ Lock caching
• Locks can be cached by client across transactions
• Locks can be called back by the server
▪ Adaptive lock granularity
• Lock granularity escalation
▪ switch from finer granularity (e.g., tuple) locks to a coarser granularity (e.g., relation) lock (see the sketch below)
• Lock granularity de-escalation
▪ Start with coarse granularity to reduce overheads, and switch to finer granularity if there are too many concurrency conflicts at the server
▪ Details in book
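A hypothetical sketch of client-side lock-granularity escalation, assuming helper calls lock_tuple(), lock_relation(), and release_tuple_locks() that talk to the lock manager; the names, stub bodies, and threshold are illustrative, not from the book.

```cpp
#include <string>
#include <unordered_map>

// Placeholder lock-manager calls, stubbed out for illustration only.
void lock_tuple(const std::string&, long) {}
void lock_relation(const std::string&) {}
void release_tuple_locks(const std::string&) {}

const int ESCALATION_THRESHOLD = 1000;              // assumed cut-off
std::unordered_map<std::string, int> tuple_locks;   // per-relation count

void acquire_tuple_lock(const std::string& relation, long tuple_id) {
    if (++tuple_locks[relation] >= ESCALATION_THRESHOLD) {
        // Too many fine-granularity locks: escalate to a single
        // relation-level lock and drop the individual tuple locks.
        lock_relation(relation);
        release_tuple_locks(relation);
        tuple_locks[relation] = 0;
    } else {
        lock_tuple(relation, tuple_id);   // normal fine-granularity path
    }
}
```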
▪ Data Caching
• Data can be cached at client even in between transactions
• But check that data is up-to-date before it is used (cache coherency)
• Check can be done when requesting a lock on the data item (see the sketch below)
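A minimal sketch of that coherency check, assuming the server's lock-grant reply carries the item's current version number; the names (CacheEntry, server_lock, fetch_from_server) and stub bodies are hypothetical.

```cpp
#include <string>
#include <unordered_map>

struct CacheEntry { std::string data; long version; };
std::unordered_map<long, CacheEntry> cache;   // retained across transactions

// Placeholder client-server calls, stubbed out for illustration only.
long server_lock(long)              { return 0; }   // returns current version
std::string fetch_from_server(long) { return ""; }

std::string read_item(long item_id) {
    long version = server_lock(item_id);   // check piggybacks on the lock request
    auto it = cache.find(item_id);
    if (it == cache.end() || it->second.version != version) {
        // Cached copy is missing or stale: refresh it from the server.
        cache[item_id] = { fetch_from_server(item_id), version };
    }
    return cache[item_id].data;            // up-to-date cached copy
}
```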
▪ Lock Caching
• Locks can be retained by client system even in between transactions
• Transactions can acquire cached locks locally, without contacting
server
• Server calls back locks from clients when it receives a conflicting lock request. Client returns the lock once no local transaction is using it (see the sketch below).
▪ Similar to lock callback on prefetch, but across transactions.
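A hypothetical sketch of lock caching with server callbacks, assuming stub calls request_lock_from_server() and send_lock_back() and an in_use_locally() check; none of these names are from the book.

```cpp
#include <mutex>
#include <set>

std::set<long> cached_locks;   // locks retained across transactions
std::mutex cache_mutex;

// Placeholder calls, stubbed out for illustration only.
bool in_use_locally(long)           { return false; }
void request_lock_from_server(long) {}
void send_lock_back(long)           {}

// Local transactions acquire cached locks without contacting the server.
void acquire_lock(long item_id) {
    std::lock_guard<std::mutex> g(cache_mutex);
    if (cached_locks.count(item_id) == 0) {
        request_lock_from_server(item_id);   // not cached: ask the server
        cached_locks.insert(item_id);
    }
}

// Invoked when the server calls back a lock due to a conflicting request.
void on_callback(long item_id) {
    std::lock_guard<std::mutex> g(cache_mutex);
    if (!in_use_locally(item_id)) {
        cached_locks.erase(item_id);
        send_lock_back(item_id);             // return the lock to the server
    }
    // otherwise the lock is returned once the local transaction finishes
}
```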
▪ Batch scaleup:
• A single large job; typical of most decision support queries and
scientific simulation.
• Use an N-times larger computer on an N-times larger problem.
▪ Transaction scaleup:
• Numerous small queries submitted by independent users to a shared database; typical of transaction processing and timesharing systems.
• N-times as many users submitting requests (hence, N-times as many
requests) to an N-times larger database, on an N-times larger
computer.
• Well-suited to parallel execution.
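One common way to quantify scaleup (a sketch, following the usual textbook definition): let T_S be the time a problem Q takes on a small machine M_S, and T_L the time the N-times larger problem Q_N takes on the N-times larger machine M_L.

```latex
\[
  \text{scaleup} = \frac{T_S}{T_L}
\]
% Scaleup is linear if it stays equal to 1 as N grows (the larger machine
% keeps pace with the larger problem), and sub-linear if it falls below 1.
```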
▪ Bus. System components send data on and receive data from a single communication bus.
• Does not scale well with increasing parallelism.
▪ Mesh. Components are arranged as nodes in a grid, and each component
is connected to all adjacent components
• Communication links grow with growing number of components, and
so scales better.
• But may require up to 2√n hops to send a message to a node (or √n with wraparound connections at the edge of the grid).
▪ Hypercube. Components are numbered in binary; components are
connected to one another if their binary representations differ in exactly
one bit.
• Each of the n components is connected to log(n) other components, and any two can reach each other via at most log(n) links; this reduces communication delays (see the sketch below).
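A small sketch of the hypercube numbering, assuming node ids are plain unsigned integers: flipping one bit yields the directly connected neighbors, and the number of differing bits gives the hop count.

```cpp
#include <bitset>
#include <cstdio>
#include <vector>

// Neighbors of a node in a dim-dimensional hypercube (n = 2^dim nodes):
// flip each of the dim bits in turn.
std::vector<unsigned> neighbors(unsigned node, unsigned dim) {
    std::vector<unsigned> result;
    for (unsigned bit = 0; bit < dim; ++bit)
        result.push_back(node ^ (1u << bit));
    return result;
}

// Hops between two nodes = number of differing bits (at most dim = log2(n)).
unsigned hops(unsigned a, unsigned b) {
    return static_cast<unsigned>(std::bitset<32>(a ^ b).count());
}

int main() {
    // 8-node hypercube (dim = 3): node 0 connects to nodes 1, 2, 4,
    // and reaching node 7 (binary 111) takes 3 = log2(8) hops.
    for (unsigned n : neighbors(0, 3)) std::printf("%u ", n);
    std::printf("\nhops(0,7) = %u\n", hops(0, 7));
    return 0;
}
```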
▪ Tree-like Topology. Widely used in data centers today
▪ Ethernet
• 1 Gbps and 10 Gbps common, 40 Gbps and 100 Gbps are
available at higher cost
▪ Fiber Channel
• 32-128 Gbps available
▪ Infiniband
• a very-low-latency networking technology
▪ 0.5 to 0.7 microseconds, compared to a few microseconds for optimized Ethernet
▪ Shared memory system can have multiple processors, each with its own
cache levels
▪ Cache coherency:
• Local cache may have out of date value
• Strong vs weak consistency models
• With weak consistency, need special instructions to ensure cache is
up to date
▪ Memory barrier instructions
• Store barrier (sfence)
▪ Instruction returns after forcing cached data to be written to
memory and invalidations sent to all caches
• Load barrier (lfence)
▪ Returns after ensuring all pending cache invalidations are
processed
• mfence instruction does both of above
▪ Locking code usually takes care of barrier instructions
• lfence done after lock acquisition, and sfence done before lock release (see the sketch below)
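A sketch of how locking code typically folds in the barriers, assuming an x86 target and that compiler intrinsics are acceptable for illustration; _mm_lfence() and _mm_sfence() correspond to the load and store barriers above (mfence would do both), and the spinlock itself reuses test-and-set.

```cpp
#include <atomic>
#include <immintrin.h>

std::atomic_flag mutex_flag = ATOMIC_FLAG_INIT;

void lock() {
    // Acquire the mutex with test-and-set, then issue a load barrier so
    // that pending cache invalidations are processed before reading
    // shared data.
    while (mutex_flag.test_and_set(std::memory_order_relaxed)) { }
    _mm_lfence();
}

void unlock() {
    // Store barrier before release: force our cached writes out (and
    // invalidations to other caches) before the mutex is freed.
    _mm_sfence();
    mutex_flag.clear(std::memory_order_relaxed);
}
```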
▪ Network partitioning
▪ Availability of system
• If all nodes are required for the system to function, the failure of even one node stops the system from functioning.
• Higher system availability through redundancy
▪ Data can be replicated at remote sites, and the system can function even if a site fails.
▪ Services
▪ Microservice Architecture
• Application uses a variety of services
• A service can add or remove instances as required
▪ Kubernetes supports containers and microservices