Interview With Doug Nebeker: Importance Of Monitoring Server

Expert Interview Series: Doug Nebeker Of Power Admin On The Importance Of Monitoring Your Server for NETSCOUT

October 29, 2015

Time is money. That means unscheduled downtime on a computer network is lost money, while an underperforming server can make a business less profitable than it could be.

In today's demanding digital business climate, every business - no matter what size - needs to be operating as efficiently as possible to remain competitive and flourish. Monitoring your server is a simple and effective way to make sure that your network is performing as profitably as possible.

We talked to Doug Nebeker from Power Admin to learn about the importance of monitoring your server.

For monitoring a server to be ultimately effective, a system administrator needs to have a clear plan and set goals. What are some different goals that an admin can monitor, and why are they important?

A very common goal for system administrators is uptime or availability expressed in "nines," with five nines (99.999%) being the gold standard. Hitting five nines means a system had less than 6 minutes (about 5.26 minutes) of unscheduled downtime in the year. Quite often, an SLA (Service Level Agreement) will specify how much availability is expected, and that becomes the minimum the system administrator will need to hit.
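To make the "nines" concrete, here is a minimal sketch that converts an availability percentage into a yearly downtime budget. The function name and the 365-day year are assumptions made for illustration; an actual SLA may define its measurement window differently.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]:
    minutes = allowed_downtime_minutes(pct)
    print(f"{label} ({pct}%): {minutes:.2f} minutes of downtime per year")
```

Each extra nine cuts the budget by a factor of ten, which is why five nines leaves only a little over five minutes for the whole year.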

Troubleshooting server performance is one of the main reasons for monitoring a server. What signs should admins watch for that a server isn't performing as well as it could?

There are a number of measurements that indicate where a server is struggling. If memory is constantly being swapped to disk, the server has too little memory and that is causing a bottleneck. If the CPU is constantly pegged, a more powerful CPU or just more CPUs will often relieve that bottleneck. If the read or write queue for the disk gets too large, the disks aren't keeping up with the demand for their services, and moving to a faster disk subsystem would help performance. The same goes for the network card's write queue; if it's filling up, the network isn't keeping up with the demand.
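The four checks above could be sketched as a simple rule table. Everything here is hypothetical: the `Sample` fields, the threshold values, and the advice strings are assumptions for illustration, not settings from any particular monitoring product. A resource is flagged only when every sample in the window exceeds its limit, which is one way to read "constantly."

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sample:
    """One reading of the four measurements (field names are illustrative)."""
    cpu_percent: float
    pages_swapped_per_sec: float
    disk_queue_length: float
    network_send_queue_length: float


# Assumed thresholds for illustration only; tune them for your own servers.
LIMITS = {
    "cpu_percent": (95.0, "CPU is pegged: consider a faster CPU or more CPUs"),
    "pages_swapped_per_sec": (100.0, "memory is swapping to disk: consider adding RAM"),
    "disk_queue_length": (2.0, "disk queue is backing up: consider a faster disk subsystem"),
    "network_send_queue_length": (2.0, "network send queue is filling: the network is not keeping up"),
}


def find_bottlenecks(samples: List[Sample]) -> Dict[str, str]:
    """Flag a resource only if every sample in the window exceeds its limit."""
    findings: Dict[str, str] = {}
    if not samples:
        return findings
    for field, (limit, advice) in LIMITS.items():
        if all(getattr(s, field) > limit for s in samples):
            findings[field] = advice
    return findings


window = [
    Sample(99.0, 5.0, 0.5, 0.0),
    Sample(98.5, 3.0, 0.7, 0.0),
    Sample(100.0, 4.0, 0.4, 1.0),
]
for field, advice in find_bottlenecks(window).items():
    print(f"{field}: {advice}")
```

Requiring the whole window to exceed the limit avoids alerting on a single short spike, at the cost of reacting more slowly.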

Monitoring a server is useful for preventing problems rather than solving them once they've occurred. What are some signs of potential problems that can be unearthed by monitoring a server?

Low disk space is probably the easiest problem to watch for, and it usually has fairly bad results when it's not caught. On Windows, if there are problems writing to a disk, there is usually an event written to the Event Log warning you that the disk might fail soon, so it can be replaced before that happens. With good historical reports and statistics about CPU, memory, disk and network usage, you should be able to do capacity planning to see when your systems will be at capacity - so some planning ahead can prevent problems of systems not keeping up with demand.
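As a rough illustration of capacity planning from historical data, the sketch below fits a straight line to daily disk usage and estimates how many days remain before the disk fills. The function name and the sample figures are made up for the example, and a linear trend is an assumption; real growth is often less regular.

```python
from typing import Optional, Sequence


def days_until_full(daily_used_gb: Sequence[float], capacity_gb: float) -> Optional[float]:
    """Estimate days from the last sample until usage reaches capacity.

    Uses an ordinary least-squares line through one sample per day.
    Returns None when there is too little data or usage is not growing.
    """
    n = len(daily_used_gb)
    if n < 2:
        return None
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(daily_used_gb) / n
    denom = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, daily_used_gb)) / denom
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    full_day = (capacity_gb - intercept) / slope  # day index where the line hits capacity
    return max(0.0, full_day - (n - 1))


history = [100, 102, 104, 106, 108]  # hypothetical GB used on five consecutive days
print(days_until_full(history, capacity_gb=128))
```

With usage growing by 2 GB per day and 20 GB left, the estimate comes out at 10 days, which is enough notice to add storage before it becomes an outage.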

Server monitoring solutions continually scan server performance. What are some situations where this is more effective and efficient than performing periodic scans?

We have had more than one customer that would spend all morning logging in to each server they were responsible for to check disk space, check for system errors, and look at how much memory the server was using. There are two problems with that:

  1. System administrators are expensive, and they could be working on higher value work than just "babysitting" servers.
  2. The system administrator only sees the server for those few minutes, leaving the server unmonitored for the rest of the day and night. Memory or disk space might have been a problem an hour earlier, but that won't be discovered.

Contrast that with server monitoring software that can check each of those values every minute. Problems don't go unnoticed, day or night. And good historical data can be captured which will allow for spotting trends and doing capacity planning for the future.
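A minimal sketch of that kind of repeated check, using only Python's standard library and limited to disk space. The function names, the 10% threshold, and the fixed number of checks are assumptions for illustration; a real monitoring product would run continuously, cover many more values, and send notifications rather than return a list.

```python
import shutil
import time
from typing import Callable, List


def free_percent(path: str = ".") -> float:
    """Return the percentage of free space on the disk holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total * 100


def monitor(
    path: str = ".",
    min_free_percent: float = 10.0,   # assumed alert threshold
    interval_seconds: float = 60.0,   # "every minute"
    checks: int = 3,                  # bounded here so the sketch terminates
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Check free disk space `checks` times and collect alert messages."""
    alerts = []
    for i in range(checks):
        pct = free_percent(path)
        if pct < min_free_percent:
            alerts.append(f"check {i + 1}: only {pct:.1f}% free on {path}")
        if i < checks - 1:
            sleep(interval_seconds)
    return alerts


# A single check returns immediately, with no waiting between runs.
print(monitor(checks=1))
```

Because every reading is taken on a schedule, the same loop could also append each value to a log, which is where the historical data for trend reports would come from.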

How much time and money does a company stand to lose by going offline? What about the loss to reputation and trust?

Dun & Bradstreet reports that 59% of Fortune 500 companies experience a minimum of 1.6 hours of downtime per week. They use the example of a large company with 10,000 employees who are paid an average of $56 per hour (this includes benefits, employer costs, etc.). In that example, the downtime would cost $896,000 per week, and that is only the cost associated with lost productivity. If an e-commerce server is down, the cost in lost sales could be much higher.
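The arithmetic behind that figure is simple enough to adapt to your own numbers. This sketch (the function name is ours, and it assumes every employee is idle for the full outage) reproduces the example above:

```python
def weekly_downtime_cost(employees: int, hourly_cost: float, downtime_hours: float) -> float:
    """Lost-productivity cost: every employee idle for the whole outage."""
    return employees * hourly_cost * downtime_hours


cost = weekly_downtime_cost(employees=10_000, hourly_cost=56, downtime_hours=1.6)
print(f"Weekly cost of downtime: ${cost:,.0f}")
print(f"Yearly cost at that rate: ${cost * 52:,.0f}")
```

Swapping in a smaller headcount or a different hourly cost gives a quick estimate of what an hour of downtime is worth to a particular business.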

Even if the company isn't offline but the servers are responding slowly, customers will go elsewhere. According to studies, users will wait 3 seconds at most for a page to load. So servers need to be up, available, and operating well all the time.

Monitoring a server allows a system admin to customize their monitoring to meet specific business needs. What are some other metrics that can be measured by monitoring a server?

Server monitoring isn't just about CPU and memory. It can also keep track of how many visitors are connecting to a web server, how fast a web page is loading, how many email messages are sent and received, how many transactions are taking place in a database, and much more. Some of these numbers are useful to system administrators for keeping an eye on load, but they can also be useful to other groups in the organization to understand interactions with potential customers.

Using server monitoring solutions, is it possible to export analytics to other programs like Excel or Google Docs? If so, what are some potential applications of this feature?

Most server monitoring solutions collect historical data and can then create reports in a variety of formats, including bar and line charts, tables of data, and exports to CSV (Comma Separated Value) files, which can easily be imported into Excel. This allows for more advanced queries and reports for your particular needs - perhaps matching the data against data from another system, such as call center data.
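For a sense of what such an export looks like, here is a small sketch that writes a few readings to CSV with Python's standard `csv` module. The column names and values are invented for the example and do not reflect any specific product's export format.

```python
import csv
import io
from typing import Dict, List, Sequence


def to_csv(rows: List[Dict[str, object]], fieldnames: Sequence[str]) -> str:
    """Render monitoring samples as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical readings for illustration.
samples = [
    {"timestamp": "2015-10-29T09:00:00", "cpu_percent": 42.5, "free_disk_gb": 118.2},
    {"timestamp": "2015-10-29T09:01:00", "cpu_percent": 47.1, "free_disk_gb": 118.1},
]
print(to_csv(samples, ["timestamp", "cpu_percent", "free_disk_gb"]))
```

Saved to a file with a `.csv` extension, this text opens directly in Excel, and the timestamp column gives a natural key for lining the readings up against data from another system.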

To Learn More About How To Optimize Your Network's Profitability, Read Our Whitepaper "The Cost Of Network Efficiency".

For more updates from Power Admin, follow them on Twitter and Google+.
