PeerMon: A Peer-To-Peer Network Monitoring System
Document Type
Conference Proceeding
Publication Date
2010
Published In
Proceedings Of The 24th International Conference On Large Installation System Administration
Abstract
We present PeerMon, a peer-to-peer resource monitoring system for general-purpose Unix local area network (LAN) systems. PeerMon is designed to monitor system resources on a single LAN, but it could also be deployed on several LANs where some inter-LAN resource sharing is supported. Its peer-to-peer design makes PeerMon a scalable and fault-tolerant monitoring system for efficiently collecting system-wide resource usage information. Experiments evaluating PeerMon's performance show that it adds little overhead to the system and that it scales well to large LANs. PeerMon was initially designed to be used by system services that provide load balancing and job placement; however, it can easily be extended to provide monitoring data for other system-wide services. We present three tools (smarterSSH, autoMPIgen, and a dynamic DNS binding system) that use PeerMon data to pick "good" nodes for job or process placement in a LAN. Tools using PeerMon data for job placement can greatly improve the performance of applications running on general-purpose LANs. We present results showing application speed-ups of up to 4.6 using our tools.
Conference
24th International Conference On Large Installation System Administration
Conference Dates
November 7-12, 2010
Conference Location
San Jose, CA
Recommended Citation
Tia Newhall; Janis Libeks, '10; Ross K. Greenwood, '11; and Jeff Knerr. (2010). "PeerMon: A Peer-To-Peer Network Monitoring System". Proceedings Of The 24th International Conference On Large Installation System Administration.
https://works.swarthmore.edu/fac-comp-sci/75
Comments
A recording of this presentation is freely available courtesy of USENIX.