Scaleup in Parallel (Systems) Database
Linear ScaleupScaleup is linear if the resources increase in proportional with the problem size (it is very rare). According to the equation given above, if small system small problem elapsed time is equal to large system large problem elapsed time, then Scaleup = 1 which is linear.
Sub-linear ScaleupScaleup is sub-linear if large system large problem elapsed time is greater than small system small problem elapsed time, then the scaleup is sub-linear.
Further useful discussions:
- If scaleup is 1, i.e Linear, then performance of the system is excellent.
- If scaleup is Sub-linear and the value falls between 0 and 1, then we need extra care to choose our plan for parallel execution. For example, if small system small problem elapsed time is 5 seconds, and large system large problem elapsed time is 5 seconds. This clearly shows Linear. That is, 5/5 = 1. For other values of denominator, especially low values (not possible beyond a limit), we would say that the system performance is excellent. But, for higher values of the denominator, say 6, 7, 8 and so on, the scale up value falls below 1 which needs much attention for better workload redistribution.
Speedup in Parallel (Systems) Database
Linear SpeedupSpeedup is linear if the speedup is N. That is, the small system elapsed time is N times larger than the large system elapsed time (N is number of resources, say CPU). For example, if single machine does the job in 10 seconds and if parallel machine (10 single machine) does the same job in 1 second, then the speedup is (10/1)=10 which is speedup and it is linear because the speedup is achieved due to the 10 times powerful system.
Sub-Linear SpeedupSpeedup is sub-linear if speedup is less than N (which is usual in most of the parallel systems).
Further useful discussions:
- If the Speedup is N, i.e Linear, then it means the expected performance is achieved.
- If the Speedup is not equal to N, then following two cases possible;
- Case 1: If Speedup > N, then it means the system performs more than it is designed for. The Speedup value in this case would be less than 1.
- Case 2: If Speedup < N, then it is Sub-linear. In this case, the denominator (large system elapsed time) is more than the single machine's elapsed time. In such case, the value would vary between 0 and 1, where we need to fix a threshold value such that the value below threshold should not be accepted as parallel operation. Such a system needs extra care to redistribute the workload among processors.
What are the reasons for Sub-linear performance of speedup and scaleup
Startup costs.For every single operation which is to be done in parallel, there is an associated cost involved in starting the process. That is, breaking a given big process into many small processes consumes some amount of time or resources. For example, a query which tries to sort the data of a table need to partition the table as well as instructing various parallel processors to execute the query before any parallel operation begins.
This query sorts the records of Emp table on Salary attribute. Let us assume that there are 1 million employees, so one million records. It will consume some time to execute in a computer with minimal resources. If we would like to execute the query in parallel, we need to distribute (or redistribute) the data into several processors (in case of Shared Nothing Architecture), also, we need to instruct all the processors to execute the query. These two operations need some amount of time well before any parallel execution happens.