commit fbfe4dffb459929a40056eda62af7e4afdc18e6f · tulkdan.dev/Languages

+64

SoftwareArchitectures/DistributedSystem.org

··· 1 + * ACID 2 + Acronym that refers to: 3 + - Atomicity :: entire statement is executed, or none of it is. Prevents data loss and corruption from ocurring 4 + - Consistency :: ensures that changes are only made do tables predefined and predictable ways. Ensure that corruption or errors in your data do not create unintended consequences for the integrity of your table 5 + - Isolation :: each transaction is treated as independent from any other. Ensure that multiple users can read and write asynchronously in the same database, without interference 6 + - Durability :: ensure that changes to your data made by successfully executed transactions will be saved, even in the event of system failure 7 + * Distributed systems 8 + 9 + ** Concurrency 10 + *** Pessimistic concurrency control 11 + most of the transactions will try to access the same resource simultaneously. Used to prevent cocurrent access to a shared resource and provide a system of acquiring a Lock on the data item. *Problem*: if a transaction acquires a lock on a resource so that no other transactions can access it, this might result in recuding concurrency of the overall system 12 + 13 + *** Optmistic concurrency control 14 + has an assumption that 0 or very few trnasactions will try to access a certain resource simultaneously, has 4 phases of operation: 15 + - Read phase :: reads the data while also logging the timestamp at which data is read to verify for conflicts during the validation phase 16 + - Execution phase :: executes all its operation like create, read, update or delete 17 + - Validation phase :: before commiting the transaction, a validation check is performed to ensure consistency by checking the *last_updated* timestamp with the one recorded at *read_phase*. if the timestam matches, then the transaction will be allowed to be committed and hence procesed with the commit phase 18 + - Commit phase :: the transaction will either be committed or aborted, depending on the validation check performed during previous phase 19 + 20 + * Idempotency 21 + Ensures that repeated calls with the same input yield the same result 22 + 23 + - enhances system reliability by ensuring consistent outcomes 24 + - allows safe retries without causing duplicate actions 25 + - prevents data corruption from repeated or failed requests 26 + - allow systems to retry operations without negative consequences 27 + - ensure multiple instances can handle requests consistently 28 + 29 + * Starvation 30 + Also has the namo of indefinite blocking 31 + 32 + phenomenon associated with the priority scheduling algorithms, in which a process ready for the CPU (resource) can wait to run indefinitely becauso of low priority 33 + 34 + * Sharding database 35 + It is a pattern related to horizontal partitioning - the practice of reparating one table's rows into multiple different tables, known as partitions. 36 + Each partition has the same schema and columns, but different rows and data. 37 + 38 + Sharding involves breaking up one's data into two or more smealler chunks, called logical shards. 39 + The logical shards are then distributed across separate database nodes, referred to as physical shards, which can hold multiple logical shards. 40 + Despite this, the data held within all the shards collectively represent an entire logical dataset 41 + 42 + Shards are autonomous, they don't share any of the same data or computing resources. 43 + In some cases, it may make sense to replicate certain tables into each shard to serve as reference tables. 44 + 45 + Oftentimes, sharding is implemented at the application level, meaning that the application includes code that defines which shard to transmit reads and writes to. 46 + Some database management systems have sharding capabilities built in, allowing you to implement sharding directly at the database level. 47 + 48 + ** Benefits 49 + - facilitates horizontal scaling 50 + - query response times are lower, because the query is running on low data, instead of a monolithic database that the query might need to search in all rows 51 + - helps to make application more reliable by mitigating the impact of outages, with this, if an outage happen, it will likely to affect only a single shard, instead of all the database 52 + 53 + ** Drawbacks 54 + - high complexity to implement 55 + - significant risk that the sharding process can lead to lost data or corrupted tables 56 + - high impacto on team's workflow 57 + - shards might become unbalanced 58 + - once sharded, it is very difficult to return it to unsharded architecture 59 + - not supported by every database engine 60 + 61 + ** Sharding architectures 62 + - key based sharding :: involves using a value taken from newly written data and plugging it into a hash function to determine which shard the data should go to 63 + - range based sharding :: involves sharding data based on ranges of a given value 64 + - directory based sharding :: to implement it is necessary to create and maintain a lookup table that uses a shard key to keep track of which shard holds which data. Main appeal is flexibility. The need for a lookup table before every query or write can have a detrimental impact on an application's performance and it can become a single point of failure