Reliability For Network Swapping Systems That Support Migration Of Remotely Swapped Pages
Document Type
Conference Proceeding
Publication Date
2004
Published In
Proceedings Of The 16th IASTED International Conference On Parallel And Distributed Computing And Systems
Abstract
Network swapping systems allow individual cluster nodes with over-committed memory to use the idle memory of remote nodes as their backing store, and to swap their pages over the network. As the number of nodes in a cluster increases, it becomes more likely that a node will fail or become unreachable, making it important that such a system provide reliability support. Without reliability, a single node crash can affect programs running on other cluster nodes by losing remotely swapped page data that was stored on the crashed node. Our network swapping system, Nswap, has design features that complicate reliability: swapped pages can migrate from one node to another in response to changes in a node's local memory needs. As a result, reliability schemes that rely on fixed placement of page and reliability data are not applicable to our system. Our reliability solutions solve the unique challenge of providing reliability to network swapping systems that both support dynamic changes to the size of remote RAM swap space and support migration of remotely swapped page data. Results show that even though our Mirroring reliability scheme adds time and space overhead to Nswap, it still outperforms swapping to disk by a factor of up to 8.2. Our dynamic Parity scheme will provide reliability with minimal time and space overhead.
Keywords
Cluster computing, Network RAM, reliability
Published By
ACTA Press
Conference
16th IASTED International Conference On Parallel And Distributed Computing And Systems
Conference Dates
November 9-11, 2004
Conference Location
Cambridge, MA
Recommended Citation
Tia Newhall; Benjamin R. Mitchell, '05; and Julian A. Rosse, '04.
(2004).
"Reliability For Network Swapping Systems That Support Migration Of Remotely Swapped Pages".
Proceedings Of The 16th IASTED International Conference On Parallel And Distributed Computing And Systems.
https://works.swarthmore.edu/fac-comp-sci/79