Written by Ted | in Network Monitoring | 5 min read

NMS Polling



Network monitoring is a core requirement for the day-to-day operations of any ISP. Good performance monitoring lets you catch issues before they become outages, respond quickly when outages do occur, and keep historical time series data for tracking changes over time.

There are myriad network monitoring solutions on the market, from open source platforms like Zabbix to commercial offerings like SolarWinds, with many more in between.

With Alpha|Stack, Network Monitoring comes integrated out of the box, in keeping with our core philosophy that everything your business needs to run smoothly is available in one place, with information shared seamlessly with all relevant parties. Customer support representatives have access to charts, graphs, alerts, and customer statuses without having to file support requests with the network team, and the Alpha|Stack network monitoring system contains everything your core network team has come to expect, plus a whole lot more than they thought was possible!

[Image: Alpha|Stack Network Monitoring overview]

Letting support staff view device statistics on the same platform where they do all their other work saves time, shortens calls, and improves customer satisfaction, but we’re here to talk about the technical parts. How does it work?

Meet the Alpha|Stack Poller: GoPoll!

Alpha|Stack GoPoll! is a small application that can run on most appliances. It is written in Go and can be installed on any local network device capable of running Docker.

The majority of network monitoring performed today relies on two protocols, SNMP and ICMP, to collect data about network performance. While there are other options out there, like TR-069 and IPFIX, SNMP and ICMP have been the default choices for decades.

ICMP

ICMP allows us to collect data about the packet loss and latency of devices. While a variety of data points can be collected using ICMP, latency and packet loss tend to be the main ones. The Alpha|Stack poller sends ICMP echo requests (specifically, using fping) to collect these metrics, which allows us to track the history of a device’s performance.
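To make those two metrics concrete, here is a minimal sketch of how one polling cycle of echo results could be reduced to packet loss and latency figures. It is an illustration only, not GoPoll!'s actual code: the ProbeResult and Summary types and the Summarize function are hypothetical names, and the probe data is assumed to have already been parsed from whatever tool sent the pings.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// ProbeResult is a hypothetical record of one ICMP echo attempt.
type ProbeResult struct {
	RTT      time.Duration // round-trip time; only meaningful if Received is true
	Received bool          // false when no reply arrived before the timeout
}

// Summary holds the metrics derived from one polling cycle.
type Summary struct {
	Sent        int
	Received    int
	LossPercent float64
	Min         time.Duration
	Median      time.Duration
	Max         time.Duration
}

// Summarize computes packet loss and min/median/max latency for a set of probes.
func Summarize(probes []ProbeResult) Summary {
	s := Summary{Sent: len(probes)}
	var rtts []time.Duration
	for _, p := range probes {
		if p.Received {
			rtts = append(rtts, p.RTT)
		}
	}
	s.Received = len(rtts)
	if s.Sent > 0 {
		s.LossPercent = 100 * float64(s.Sent-s.Received) / float64(s.Sent)
	}
	if len(rtts) == 0 {
		return s // nothing came back, so there is no latency to report
	}
	sort.Slice(rtts, func(i, j int) bool { return rtts[i] < rtts[j] })
	s.Min, s.Max = rtts[0], rtts[len(rtts)-1]
	mid := len(rtts) / 2
	if len(rtts)%2 == 1 {
		s.Median = rtts[mid]
	} else {
		s.Median = (rtts[mid-1] + rtts[mid]) / 2
	}
	return s
}

func main() {
	probes := []ProbeResult{
		{RTT: 12 * time.Millisecond, Received: true},
		{RTT: 15 * time.Millisecond, Received: true},
		{Received: false}, // one lost packet
		{RTT: 11 * time.Millisecond, Received: true},
	}
	fmt.Printf("%+v\n", Summarize(probes))
}
```

Storing a summary like this for every cycle is what makes it possible to chart a device's history later.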

The other metric we can calculate from this data is jitter. Jitter is the variation in delay between packets, and high jitter can cause issues with real-time communications. Voice is the most noticeable example, but jitter can also be very impactful to online gamers and anyone else using real-time applications that rely heavily on a consistent flow of data.
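There is more than one way to put a number on jitter. The sketch below uses one common definition, the mean absolute difference between consecutive round-trip times, and is not necessarily the calculation Alpha|Stack performs. MeanJitterMs is a hypothetical helper, and the input is assumed to be a list of round-trip times in milliseconds, in the order the replies arrived.

```go
package main

import (
	"fmt"
	"math"
)

// MeanJitterMs returns the mean absolute difference between consecutive
// round-trip times, in milliseconds. With fewer than two samples there is
// nothing to compare, so it returns 0.
func MeanJitterMs(rttsMs []float64) float64 {
	if len(rttsMs) < 2 {
		return 0
	}
	var total float64
	for i := 1; i < len(rttsMs); i++ {
		total += math.Abs(rttsMs[i] - rttsMs[i-1])
	}
	return total / float64(len(rttsMs)-1)
}

func main() {
	steady := []float64{10, 10, 10, 10}
	bursty := []float64{10, 14, 12}
	fmt.Println(MeanJitterMs(steady)) // no variation between packets
	fmt.Println(MeanJitterMs(bursty)) // differences of 4 ms and 2 ms
}
```

Two connections can have similar typical latency and still behave very differently for voice, which is why this number is worth tracking separately.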

[Image: Latency graph]

In the Alpha|Stack graph shown above, the grey band shows the range of response times (the jitter) over time, whereas the colored line shows the median latency. Looking only at the colored line, the connection appears very stable, but once the range of response times is included, we can see significant variation at some points.

SNMP

SNMP (Simple Network Management Protocol) has been around for a very long time, and the overwhelming majority of ISP-level network devices support it. From a monitoring perspective, SNMP is typically used to collect data from devices (SNMP polling) or to be alerted of events (via SNMP traps). Alpha|Stack relies on SNMP polling to collect data about devices, and we allow Alpha|Stack users to define any type of polling they like. With SNMP polling, it’s up to the device manufacturer to expose different metrics: some will give access to a very large array of data, whereas others will limit the information that can be collected.

Typically, a device will expose some standard parameters (for example, the throughput and error rate of physical interfaces, CPU usage, or uptime) and a variety of proprietary information that is relevant to the device in question. For example, a UPS may expose the remaining battery life, or whether or not it’s currently receiving power from the grid. Alpha|Stack allows users to both collect this data and alert on it.
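Interface throughput is a good example of why polled data needs a little processing. Devices usually report traffic as ever-increasing octet counters (ifInOctets in the standard IF-MIB, for instance) rather than as a rate, so throughput has to be derived from two successive polls. The sketch below shows one way that could be done, including the wrap-around of 32-bit counters. BitsPerSecond is a hypothetical helper written for this post, not part of GoPoll!.

```go
package main

import (
	"fmt"
	"time"
)

// BitsPerSecond derives a throughput rate from two readings of an octet
// counter taken `elapsed` apart. width is the counter size in bits (32 or 64).
// The boolean is false when no sensible rate can be computed.
func BitsPerSecond(prev, curr uint64, elapsed time.Duration, width int) (float64, bool) {
	if elapsed <= 0 {
		return 0, false
	}
	var delta uint64
	switch {
	case curr >= prev:
		delta = curr - prev
	case width == 32:
		// Assume the 32-bit counter wrapped exactly once between polls.
		// A device reboot looks the same, so real code may want to check
		// uptime as well.
		delta = (1<<32 - prev) + curr
	default:
		// A 64-bit counter going backwards almost certainly means a reset.
		return 0, false
	}
	return float64(delta) * 8 / elapsed.Seconds(), true
}

func main() {
	// 7,500,000 octets in 60 seconds.
	bps, ok := BitsPerSecond(1000, 7501000, 60*time.Second, 64)
	fmt.Println(bps, ok)

	// A 32-bit counter that wrapped between polls.
	bps, ok = BitsPerSecond(4294967000, 704, 8*time.Second, 32)
	fmt.Println(bps, ok)
}
```

The same pattern applies to error counters: the interesting value is the change between polls, not the raw number.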

Powering GoPoll!

GoPoll! runs inside an event loop powered by Go. This event loop makes it easy for us to execute non-blocking, asynchronous work to fetch data from devices using ICMP, SNMP, or other methods.

GoPoll! also collects a number of other metrics automatically to drive things like network discovery or re-discovery. Our goal when collecting data is to make as few requests as possible, as the most time-intensive part of network monitoring is requesting and receiving data back from the network. This means, for example, using SNMP bulk walks (snmpbulkwalk, built on GETBULK requests) when we can, or performing multiple SNMP GET requests at once rather than running them sequentially.
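To illustrate the "at once rather than sequentially" idea, here is a minimal sketch of a bounded concurrent poll using goroutines. It is not GoPoll!'s implementation: PollAll, Result, and FetchFunc are hypothetical names, and FetchFunc stands in for whatever actually talks to the device (an SNMP GET, a ping, and so on).

```go
package main

import (
	"fmt"
	"sync"
)

// FetchFunc is a hypothetical stand-in for one network request to a target.
type FetchFunc func(target string) (string, error)

// Result pairs a target with what the fetch returned for it.
type Result struct {
	Target string
	Value  string
	Err    error
}

// PollAll runs fetch for every target concurrently, with at most `limit`
// requests in flight, and returns results in the same order as targets.
func PollAll(targets []string, limit int, fetch FetchFunc) []Result {
	if limit < 1 {
		limit = 1
	}
	results := make([]Result, len(targets))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	for i, t := range targets {
		wg.Add(1)
		sem <- struct{}{} // blocks while `limit` requests are already running
		go func(i int, t string) {
			defer wg.Done()
			defer func() { <-sem }()
			v, err := fetch(t)
			// Each goroutine writes only its own slot, so no lock is needed.
			results[i] = Result{Target: t, Value: v, Err: err}
		}(i, t)
	}
	wg.Wait()
	return results
}

func main() {
	// A fake fetch so the example runs without a network.
	fake := func(target string) (string, error) {
		return "reply from " + target, nil
	}
	for _, r := range PollAll([]string{"10.0.0.1", "10.0.0.2", "10.0.0.3"}, 2, fake) {
		fmt.Println(r.Target, r.Value, r.Err)
	}
}
```

Because the slow part is waiting on the network, overlapping the waits shortens a polling cycle far more than optimizing the code around them, while the limit keeps the poller from flooding devices with requests.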

When performing SNMP queries, the poller will also query a number of other parameters up front. One of the things we query is the sysObjectID. This is described in RFC 1213 as follows:

The vendor’s authoritative identification of the network management subsystem contained in the entity.

This value is allocated within the SMI enterprises subtree (1.3.6.1.4.1) and provides an easy and unambiguous means for determining "what kind of box" is being managed. For example, if vendor "ACME, Inc." was assigned the subtree 1.3.6.1.4.1.424242, it could assign the identifier 1.3.6.1.4.1.424242.1.1 to its "ACME-1000 Router".

Once we’ve determined the type of device, we can then collect more specific data from it to aid in providing more detailed information within Alpha|Stack.
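As a rough sketch of how a sysObjectID can be turned into "what kind of box", the code below extracts the vendor's enterprise number and looks the full OID up in a device table. The function names and the lookup table are hypothetical, and the only entry in the table is the fictitious ACME example from the text above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// enterprisesPrefix is the SMI enterprises subtree quoted above.
const enterprisesPrefix = "1.3.6.1.4.1."

// knownDevices is a hypothetical lookup table; the single entry is the
// fictitious ACME example, not a real device.
var knownDevices = map[string]string{
	"1.3.6.1.4.1.424242.1.1": "ACME-1000 Router",
}

// EnterpriseNumber extracts the vendor's enterprise number from a
// sysObjectID. It reports false if the OID is outside the enterprises subtree.
func EnterpriseNumber(sysObjectID string) (int, bool) {
	oid := strings.TrimPrefix(sysObjectID, ".") // some tools print a leading dot
	if !strings.HasPrefix(oid, enterprisesPrefix) {
		return 0, false
	}
	first, _, _ := strings.Cut(oid[len(enterprisesPrefix):], ".")
	n, err := strconv.Atoi(first)
	if err != nil || n < 0 {
		return 0, false
	}
	return n, true
}

// DeviceModel returns the model name for a sysObjectID, if it is known.
func DeviceModel(sysObjectID string) (string, bool) {
	model, ok := knownDevices[strings.TrimPrefix(sysObjectID, ".")]
	return model, ok
}

func main() {
	oid := ".1.3.6.1.4.1.424242.1.1"
	if n, ok := EnterpriseNumber(oid); ok {
		fmt.Println("enterprise number:", n)
	}
	if model, ok := DeviceModel(oid); ok {
		fmt.Println("model:", model)
	}
}
```

With the model identified, a poller can pick the set of vendor-specific OIDs worth asking for instead of probing blindly.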

Back into Alpha|Stack

Once Alpha|Stack receives the data from the poller, it has to process it before displaying it. The basic metrics collected via SNMP and ICMP are aggregated and stored in an auto-scaling PostgreSQL database, ready to be presented in the user interface.
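The exact aggregation Alpha|Stack performs isn't covered here, but a common approach for time series data is to roll raw samples up into fixed-width time buckets and keep the minimum, average, and maximum for each. The sketch below shows that idea under those assumptions. Sample, Bucket, and Aggregate are hypothetical names, and timestamps are assumed to be non-negative Unix seconds.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Sample is one raw measurement, timestamped in Unix seconds.
type Sample struct {
	Unix  int64
	Value float64
}

// Bucket summarizes all samples that fall within one time window.
type Bucket struct {
	Start int64 // Unix second at which the window begins
	Min   float64
	Avg   float64
	Max   float64
	Count int
}

// Aggregate groups samples into windows of `width` seconds and returns the
// buckets sorted by start time.
func Aggregate(samples []Sample, width int64) []Bucket {
	if width <= 0 {
		return nil
	}
	byStart := make(map[int64]*Bucket)
	for _, s := range samples {
		start := s.Unix - s.Unix%width // round down to the window boundary
		b, ok := byStart[start]
		if !ok {
			b = &Bucket{Start: start, Min: s.Value, Max: s.Value}
			byStart[start] = b
		}
		b.Min = math.Min(b.Min, s.Value)
		b.Max = math.Max(b.Max, s.Value)
		b.Avg += s.Value // holds the running sum until the division below
		b.Count++
	}
	out := make([]Bucket, 0, len(byStart))
	for _, b := range byStart {
		b.Avg /= float64(b.Count)
		out = append(out, *b)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Start < out[j].Start })
	return out
}

func main() {
	samples := []Sample{{0, 10}, {60, 20}, {299, 30}, {300, 40}}
	for _, b := range Aggregate(samples, 300) { // five-minute buckets
		fmt.Printf("%+v\n", b)
	}
}
```

Keeping the minimum and maximum alongside the average is what lets a graph show a range band around the typical value rather than just a single line.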

