The monitoring management system of a high-performance computing center consists of the related hardware and the cluster monitoring software. The hardware includes the login and management nodes, the KVM system and the monitoring management system.
Login management node
The management node mainly runs the cluster monitoring and management services, such as user information management, InfiniBand subnet management, job scheduling, system monitoring and time synchronization. The management node does not need high performance, but it does need high reliability. To improve availability, two or more management nodes should be configured, and critical systems should run in redundant mode.
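As a rough illustration of the redundant configuration, the sketch below picks the active management node from a priority-ordered list. The node names and the `healthy` flag are hypothetical; how health is actually checked, and how a real cluster manager performs failover, is not specified here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ManagementNode:
    name: str      # hypothetical host name
    healthy: bool  # assumed to come from an external health check


def pick_active(nodes: List[ManagementNode]) -> Optional[ManagementNode]:
    """Return the first healthy node in priority order, or None if all are down."""
    for node in nodes:
        if node.healthy:
            return node
    return None


# The primary is down, so the standby takes over.
nodes = [ManagementNode("mgmt01", healthy=False),
         ManagementNode("mgmt02", healthy=True)]
active = pick_active(nodes)
print(active.name if active else "no management node available")
```

With a single management node, the same failure would leave the cluster without scheduling or monitoring services, which is why at least two nodes are recommended.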
The login node is used for interactive user work such as compiling programs, preparing algorithms, uploading and downloading files, and submitting and controlling jobs. The load on the login node varies greatly with the number of users and what they are doing. Because the login node may crash as a result of invalid user operations, it should not share hardware with the management node; keeping the two separate improves the reliability of the whole system. If user access traffic is high, multiple login nodes can be configured to share it.
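One simple way to share user traffic across several login nodes is to send each new session to the node with the fewest active sessions. The sketch below assumes that policy; the node names and session counts are made up for illustration, and a real deployment might instead use DNS round-robin or a dedicated load balancer.

```python
from typing import Dict


def pick_login_node(sessions: Dict[str, int]) -> str:
    """Return the login node with the fewest active sessions.

    `sessions` maps a (hypothetical) node name to its current session count.
    Ties are broken by node name so the choice is deterministic.
    """
    if not sessions:
        raise ValueError("no login nodes configured")
    return min(sessions, key=lambda name: (sessions[name], name))


# Hypothetical session counts for three login nodes.
current = {"login01": 12, "login02": 7, "login03": 7}
print(pick_login_node(current))
```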
Monitoring management software
A good high-performance computing platform not only delivers high performance and high reliability but is also easy to operate and manage. The Sugon Gridview cluster monitoring management system gives users and administrators a simple, easy-to-use, centralized platform for cluster monitoring, management and operation. Its functions include cluster deployment, cluster monitoring, cluster management, alarm management, statistics reports and job scheduling, among others.
Tiankuo I610-G30 is a widely applicable two-way server developed by Sugon on the Intel® Xeon® Scalable processor platform. In a 1U chassis, the I610-G30 combines performance, expandability and density. It suits data centers that place high demands on server performance, such as online gaming, as well as business environments that place high demands on server density and expandability, such as Internet, IDC and cloud computing.
The A620-G30 is an enterprise-level flagship 2-way server developed by Sugon on AMD's latest Naples processor platform. Independently developed by Sugon, the A620-G30 offers high-end specifications, powerful processing capability and great I/O scalability, meeting the business requirements of running mission-critical applications stably, reliably and efficiently. The A620-G30 is a good fit for industry data centers with stringent requirements for server performance, scalability and reliability, such as government, Internet, power, telecom and finance, and for remote enterprise environments.