Analyzing the load balance of term-based partitioning
Date
2011
Author
Abusukhon, A.
Talib, M.
Publisher
The Science and Information Organization Inc., http://ijacsa.thesai.org/
Type
Published Article
Abstract
In parallel information retrieval (IR) systems, where a large-scale collection
is indexed and searched, the query response time is limited by the
time of the slowest node in the system. Distributing the load equally across the nodes is therefore a very important issue. There are two main methods for indexing a collection: document-based and
term-based indexing. In term-based partitioning, the terms of the
global index of a large-scale data collection are distributed or
partitioned equally among the nodes; a given query is then divided into sub-queries, and each sub-query is directed to the relevant node. This provides high query throughput and
concurrency but poor parallelism and load balance. In this paper, we introduce new methods for term partitioning and
compare the results from our methods with those from previous work with respect to load balance and query response time.
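
The term-based scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the round-robin assignment, the function names (`partition_terms`, `split_query`, `load_imbalance`), the use of posting-list length as a stand-in for node load, and the toy data are all assumptions made for illustration. It shows why giving every node the same number of terms does not by itself give every node the same load.

```python
from collections import defaultdict


def partition_terms(terms, num_nodes):
    """Assign each term to a node round-robin (an illustrative choice only)."""
    return {term: i % num_nodes for i, term in enumerate(sorted(terms))}


def split_query(query_terms, term_to_node):
    """Group query terms into one sub-query per node that holds them."""
    sub_queries = defaultdict(list)
    for term in query_terms:
        if term in term_to_node:  # terms missing from the index are skipped
            sub_queries[term_to_node[term]].append(term)
    return dict(sub_queries)


def load_imbalance(posting_lengths, term_to_node, num_nodes):
    """Heaviest node's load divided by the mean load (1.0 means balanced)."""
    loads = [0] * num_nodes
    for term, node in term_to_node.items():
        loads[node] += posting_lengths[term]
    mean = sum(loads) / num_nodes
    return max(loads) / mean if mean else 1.0


# Hypothetical toy index: term -> posting-list length (assumed load proxy).
posting_lengths = {"apple": 100, "banana": 5, "cherry": 5, "date": 5}
term_to_node = partition_terms(posting_lengths, num_nodes=2)
sub_queries = split_query(["apple", "date", "zebra"], term_to_node)
imbalance = load_imbalance(posting_lengths, term_to_node, num_nodes=2)
```

In this toy case each node receives two terms, yet the node holding "apple" carries most of the posting-list volume, so `imbalance` is well above 1.0. Because response time is bounded by the slowest node, a skew of this kind is what a load-aware partitioning method would try to reduce.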