Michael Lowe of Indiana University will present a seminar on Future Directions in Large Scale Systems Monitoring
When: 3:15 p.m., Thursday, September 30, 2010
Where: Mesa Lab Main Seminar Room
As HPC systems have grown in size and complexity, monitoring of these systems hasn't kept pace. Current systems either don't scale or are the wrong fit; some consist of scripts that systems administrators have migrated from machine to machine. Attempts to select a monitoring solution are further complicated by requirements for sharing data across administrative boundaries and existing monitoring systems. The current state of monitoring HPC resources will be discussed along with the motivations for finding new solutions. Ongoing experiments involving message buses, column-store databases, microformats, Python, and failure prediction will also be discussed.