
Permissioned Blockchains and Distributed Databases

A Performance Study

Written by Sara Bergman

Paper category

Master Thesis


Computer Science




Thesis:

Data replication in distributed systems

Data replication is a technique widely used in distributed systems to maintain copies of data on multiple nodes. Blockchains and databases replicate data differently: in a blockchain, every node in the system keeps a copy of the data, whereas in many database systems the replication factor is adjustable. Replication brings both benefits and challenges, and the topic is well covered in the literature, for example in the book by Coulouris et al. [9]. According to Coulouris et al., replication done correctly can improve performance because the workload can be distributed across multiple nodes. It also brings challenges, because keeping all copies up to date incurs overhead and can therefore hurt performance. Replication can also increase availability, because a replicated system can tolerate one or more node crashes and still serve data to users. However, Coulouris et al. also point out that high availability does not necessarily mean that users receive the latest data; the system may serve stale data.

Therefore, if the system should be fault tolerant, it is important to consider how many replicas to use and which failure model the system should tolerate. If the system has a Byzantine failure model, and up to f servers can exhibit Byzantine failures, then a total of 3f+1 servers are required to maintain a working system [9]. If, on the other hand, f nodes in a system of f+1 nodes crash, the one node still standing is sufficient under a fail-stop or crash failure model [9]. In short, when choosing the number of replicas, the priority given to consistency versus availability and the failure model to be tolerated are important aspects to consider.

2.2 Docker - operating-system-level virtualization software

Docker is a container platform, not to be confused with a virtual machine, because it does not include a guest operating system.
Docker is a thin layer that runs on the host operating system, and containers run on top of Docker, as shown in Figure 2.1. A container is a lightweight, executable package of a program that contains everything it needs to run on Docker, such as source code and dependencies. Each container is isolated from the others, and multiple containers can run at the same time. Docker can be used for rapid development, deployment and management of software. It is particularly useful when deploying a distributed system on a local machine, because containers give each node an isolated environment to run in, and Docker can simulate the network between the nodes.

2.3 Blockchain

Blockchain technology was originally invented by the person or persons behind the name Satoshi Nakamoto [18] as the underlying structure of the cryptocurrency Bitcoin. Their paper describes the blockchain as distributed and decentralized.

2.3.2 Permission attributes and blockchain scope

Blockchains can have different permission attributes and blockchain scopes. These are two design decisions discussed by Xu et al. [24]. According to Xu et al., the permission attribute refers to the degree of trust placed in the peers, and it can be either permissionless or permissioned. Xu et al. define a permissionless blockchain as one that allows peers to join and validate transactions without permission. In a permissionless blockchain the peers are not trusted, and the system has to rely on techniques such as PoW and PoS to establish trust in a trustless system. Examples of permissionless blockchain frameworks are Bitcoin and Ethereum. Xu et al. define a permissioned blockchain as one in which peers need permission to participate, so peers that hold permission are trusted. When peers can be trusted, PoW or PoS is not required. Examples of permissioned blockchains are Fabric and the payment infrastructure Ripple.
Some blockchain frameworks are configurable, meaning they can be set up as either permissionless or permissioned. An example of such a framework is Hyperledger Sawtooth.

Xu et al. [24] define the scope of a blockchain as the factor that determines whether the network is a public, private, or consortium/community blockchain. Whether a blockchain is private or public is a question of peer anonymity. Xu et al. define a blockchain as public if anyone can access it anonymously at any given time; Bitcoin is an example. This means that the information on the blockchain can be viewed by anyone; it does not mean that anyone can become a node in the blockchain network and approve transactions. Xu et al. define a blockchain as private if it can only be accessed by certain trusted peers, as with Ripple or Eris. The number of these trusted peers may change over time, and they are not anonymous.

2.4 Distributed database

A database is an organized collection of data stored electronically. A database has a database management system (DBMS) that interacts with end users and applications and manages the data in the database. In recent years, DBMS has become a synonym for database [1], and this thesis uses the term database as a synonym for DBMS. Databases can be classified in different ways: by the data they contain, by their application area, or by the specific technical aspects they exhibit. One type of database is the distributed database, where both the data and the DBMS are distributed over multiple computers. This is the type of database of interest in this work.
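
When sizing a replicated deployment such as the distributed systems discussed here, the replica counts quoted in the data replication discussion can be applied directly. The following is a minimal Python sketch of that arithmetic, assuming only the two rules stated there (3f+1 servers for Byzantine failures, f+1 nodes for fail-stop or crash failures). The function name `min_replicas` and the model labels "byzantine" and "crash" are hypothetical names chosen for this illustration; they do not come from the thesis or from Coulouris et al. [9].

```python
def min_replicas(f: int, failure_model: str) -> int:
    """Return the minimum number of servers needed to tolerate f faulty servers.

    Rules assumed from the replication discussion:
      - "byzantine": 3f + 1 servers
      - "crash":     f + 1 servers (fail-stop or crash failures)
    The labels "byzantine" and "crash" are illustrative, not standard identifiers.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    if failure_model == "byzantine":
        return 3 * f + 1
    if failure_model == "crash":
        return f + 1
    raise ValueError(f"unknown failure model: {failure_model!r}")


if __name__ == "__main__":
    # Print the required system size for a few values of f under both models.
    for f in range(4):
        print(f, min_replicas(f, "crash"), min_replicas(f, "byzantine"))
```

For example, tolerating one Byzantine server requires four servers in total, whereas tolerating one crashed node requires two. The gap grows with f, which is one reason the choice of failure model matters when deciding on the number of replicas.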