TY - GEN
T1 - Performance models for cluster-enabled OpenMP implementations
AU - Cai, Jie
AU - Rendell, Alistair P.
AU - Strazdins, Peter E.
AU - Wong, H'sien Jin
PY - 2008/11/17
Y1 - 2008/11/17
N2 - A key issue for Cluster-enabled OpenMP implementations based on software Distributed Shared Memory (sDSM) systems, is maintaining the consistency of the shared memory space. This forms the major source of overhead for these systems, and is driven by the detection and servicing of page faults. This paper investigates how application performance can be modelled based on the number of page faults. Two simple models are proposed, one based on the number of page faults along the critical path of the computation, and one based on the aggregated numbers of page faults. Two different sDSM systems are considered. The models are evaluated using the OpenMP NAS Parallel Benchmarks on an 8-node AMD-based Gigabit Ethernet cluster. Both models gave estimates accurate to within 10% in most cases, with the critical path model showing slightly better accuracy; accuracy is lost if the underlying page faults cannot be overlapped, or if the application makes extensive use of the OpenMP flush directive.
AB - A key issue for Cluster-enabled OpenMP implementations based on software Distributed Shared Memory (sDSM) systems, is maintaining the consistency of the shared memory space. This forms the major source of overhead for these systems, and is driven by the detection and servicing of page faults. This paper investigates how application performance can be modelled based on the number of page faults. Two simple models are proposed, one based on the number of page faults along the critical path of the computation, and one based on the aggregated numbers of page faults. Two different sDSM systems are considered. The models are evaluated using the OpenMP NAS Parallel Benchmarks on an 8-node AMD-based Gigabit Ethernet cluster. Both models gave estimates accurate to within 10% in most cases, with the critical path model showing slightly better accuracy; accuracy is lost if the underlying page faults cannot be overlapped, or if the application makes extensive use of the OpenMP flush directive.
KW - computational modelling
KW - mathematical model
KW - aggregates
UR - http://www.scopus.com/inward/record.url?scp=55849111882&partnerID=8YFLogxK
UR - http://purl.org/au-research/grants/ARC/LP0669726
U2 - 10.1109/APCSAC.2008.4625433
DO - 10.1109/APCSAC.2008.4625433
M3 - Conference contribution
AN - SCOPUS:55849111882
SN - 9781424426836
T3 - 13th IEEE Asia-Pacific Computer Systems Architecture Conference, ACSAC 2008
BT - 13th IEEE Asia-Pacific Computer Systems Architecture Conference, ACSAC 2008
T2 - 13th IEEE Asia-Pacific Computer Systems Architecture Conference, ACSAC 2008
Y2 - 4 August 2008 through 6 August 2008
ER -