|
Our client is a large global media company with clients in six countries, including the US and the UK, with the rest in Asia. More than 100 editors, designers and consultants work on numerous projects out of its facility in NOIDA, on the outskirts of New Delhi, and the company has a large IT setup to support these operations.
The Challenge
The company has many servers located in different parts of the world. Its main problem was monitoring all of these boxes simultaneously, a gargantuan task in itself. The servers ran the various services the company needed to execute its projects. As part of its network monitoring requirements, the company wanted to track the usage of every service and the resource utilization of every server, such as disk space and load, so that it could make the right decisions about using those resources optimally.
They had three separate network monitoring and performance management tools running at one location, and different teams used different tools. For example, the DBA team used one tool, the network administrators another, and the Unix and Windows teams yet another to check network usage. The alerting system they used sent critical alerts by email, pager and SMS, often to completely inappropriate people. The company needed a single integrated solution on the one hand and minimum downtime on the other.
They had a few key requirements in mind that would help them operate much more effectively and efficiently:
- Platform independence: Their network management system should be able to run on the hardware they already had (at the time, SPARC/Solaris).
- Performance: Any solution should be able to handle anywhere from a few hundred to a few thousand nodes without compromising performance, and should scale without extra effort.
- Enterprise level features: They required SNMP trap management, configurable alert escalation, and availability and performance reports for the management team.
- Rationalize support roles: They should be able to take individuals out of the process by automating it as much as possible. That meant an end to emails sent by systems to developers in the middle of the night. Their operations team needed to be the first contact for every event.
- Reduce operation level tasks: They wanted the system to lighten the burden on the Operations Team, not increase it.
- Extensibility: Previous experience indicated that there was no such thing as a complete solution.
- Low cost of entry: It needed to replace a portfolio of Open Source products and thus have one integrated solution.
- Longevity: Some Open Source products seem to wither on the vine with no apparent cause, or fragment through disagreements between developers. Commercial products too were subject to the vagaries of the market. They needed a permanent and long term solution.
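The escalation requirement above (the operations team as first contact, with others drawn in only if an event stays unacknowledged) can be illustrated with a minimal Python sketch. The role names, delays and the `contacts_to_notify` helper are all hypothetical and are not taken from the client's configuration or from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationStep:
    delay_minutes: int  # minutes an event may stay unacknowledged before this step fires
    contact: str        # hypothetical role name, for illustration only


# Hypothetical escalation path: operations first, specialists only later.
ESCALATION_PATH = [
    EscalationStep(0, "operations-team"),
    EscalationStep(15, "on-call-admin"),
    EscalationStep(45, "service-owner"),
]


def contacts_to_notify(minutes_unacknowledged, path=ESCALATION_PATH):
    """Return every contact whose delay has elapsed for an unacknowledged event."""
    return [step.contact for step in path
            if step.delay_minutes <= minutes_unacknowledged]


if __name__ == "__main__":
    for elapsed in (0, 20, 60):
        print(elapsed, contacts_to_notify(elapsed))
```

With a path like this, a developer is never the first person paged: the event reaches them only after the earlier steps have had time to acknowledge it.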
The company was growing, and it was beginning to need a grown-up systems management tool, but which one? The client therefore planned to use the expertise of Torrid Networks to identify, implement and deploy the solution that fit it best.
The Solution
ZoNIX suggested its own customized OpenNMS-based network management solution (NMS), which has all the features the company needed. It monitors the company's UK and US based servers and network devices through distributed monitoring and sends a notification whenever a service fails.
Here is how things worked out against each requirement:
1. Platform independence: NMS can run on spare hardware, but it is not a good idea. A year after our first rollout of NMS, we moved from a shared SUN Ultrasparc 2 machine to a dedicated dual Xeon machine running RedHat Advanced Server. From a long-term perspective, hardware independence was essential.
2. Performance: NMS is scalable and requires no extra effort to scale. We knew there would always be users pushing the scalability of the system, so we recommended NMS.
3. Enterprise Level Features: NMS met our requirements, providing all the reports that the client wanted.
4. Rationalize Support Roles: NMS is now the single point for the distribution of all actionable network, server and application events. This does need to be constantly policed to ensure that non-standard notification paths do not creep in again.
5. Reduce Tasks: This depends on the user as well. In general, the operators' load has lessened because NMS has reduced the number of open windows on their desktops.
6. Extensibility: NMS has proved to be highly extensible. It satisfied the customer’s needs and had all the features required.
7. Low cost of entry: We deployed NMS with minimal capital outlay. We believe that the subsequent people-based operational costs have been roughly equivalent to those of a commercial solution.
8. Longevity: We seem to have backed a product with "legs." The mailing lists are as busy as ever, and new features are being added to NMS faster than we can make use of them.
The Results
ZoNIX's NMS allowed them to perform most of their tasks with ease. Some of the benefits are listed below:
1. Custom reports on resource availability helped them stay abreast of resource usage.
2. Escalation on notifications assigns clear responsibility to the different roles.
3. Alarms generated on threshold violations helped them fix issues as soon as they arose.
4. No upfront licensing cost.
5. Low ongoing maintenance cost.
6. Less day-to-day administration work.
7. Maximum advantages at minimum cost.
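To make the first and third items concrete, here is a small Python sketch of an availability figure and a generic high-threshold check. It is only an illustration under stated assumptions: the function names, the trigger count, the re-arm value and the sample numbers are hypothetical and do not describe the deployed configuration.

```python
def availability_percent(total_seconds, outage_seconds):
    """Percentage of the reporting window during which a service was up."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return 100.0 * (total_seconds - outage_seconds) / total_seconds


def high_threshold_alarms(samples, limit, rearm, trigger=2):
    """Return the sample indexes at which an alarm would be raised.

    An alarm fires once `trigger` consecutive samples exceed `limit`.
    No further alarm fires until a sample drops to `rearm` or below.
    """
    alarms = []
    consecutive = 0
    armed = True
    for index, value in enumerate(samples):
        if value > limit:
            consecutive += 1
            if armed and consecutive >= trigger:
                alarms.append(index)
                armed = False
        else:
            consecutive = 0
            if value <= rearm:
                armed = True
    return alarms


if __name__ == "__main__":
    # Hypothetical disk-usage samples in percent.
    usage = [70, 92, 95, 97, 85, 60, 93, 96]
    print(high_threshold_alarms(usage, limit=90, rearm=75))
    # One hour of outage in a 30-day window.
    print(round(availability_percent(30 * 24 * 3600, 3600), 2))
```

Requiring several consecutive violations and a re-arm value keeps a single spike, or a value hovering around the limit, from flooding operators with repeated alarms.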
|