I recently configured a Linux-based system running BGP (Quagga / Zebra) to allow SNMP data to be collected by a monitoring server. Initially all looked fine: interface metrics and CPU utilization were showing. Then, at random intervals, the monitoring server became unable to poll the router.
Checking on the router, I found the SNMPD process maxed out at 100% CPU. After running strace I discovered that a race condition was occurring when the monitoring server initiated a poll: the SNMPD process was looping over the ipRouteEntry for each interface and would never complete. I already had a feeling that the size of the routing table would cause issues with SNMP polling, since the monitoring server pulls back everything below the root OID and the routing table contains roughly 500k entries!
I found nothing useful in the man pages and little else beyond a few bug reports. After a bit of tweaking I managed to exclude the OIDs that caused this condition. The main fix is to create a ‘view’ that excludes the routing-based OIDs, then apply this view to the access entry for the group that your monitoring server's community maps to.
I wanted to share this as it is a bit of a stinker, especially if you want to run SNMPD on a Linux router with a large routing table. Hope it helps!
Below is a copy of my SNMPD config.
proc /usr/lib/quagga/zebra
proc /usr/lib/quagga/bgpd
proc /usr/lib/quagga/ospfd

interface eth0 6 100000000
interface eth1 6 100000000

# Community to security names
com2sec local localhost public
com2sec mon-svr 192.168.29.10 public

# Security to group names
group monitoring v2c local
group monitoring v2c mon-svr

# Define view
view no_routes included .1 80
# These are required to prevent SNMPD racing
view no_routes excluded ipRouteEntry
view no_routes excluded ipForward
view no_routes excluded ipRouteDest
view no_routes excluded ipRouteTable
view no_routes excluded ipNetToMediaTable
view no_routes excluded ipNetToPhysicalTable
view no_routes excluded at

access monitoring "" any noauth exact no_routes none none
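If it helps to see why the excluded lines win over the "included .1" line, here is a small Python sketch of the view logic. It is a simplified model, not SNMPD's actual code: it assumes the most specific (longest) matching subtree decides whether an OID is visible, which is how the SNMP view-based access model is specified, and it ignores the mask field entirely. The names parse_oid, in_view and VIEW_NO_ROUTES are made up for illustration; the numeric OIDs are the standard MIB-II / IP-MIB assignments for the names in the config.

```python
# Simplified, hypothetical model of SNMP view matching (masks ignored).
# Each entry mirrors one "view no_routes ..." line from the config, with
# the MIB names written out as their standard numeric OIDs.
VIEW_NO_ROUTES = [
    ("included", ".1"),
    ("excluded", ".1.3.6.1.2.1.4.21.1"),    # ipRouteEntry
    ("excluded", ".1.3.6.1.2.1.4.24"),      # ipForward
    ("excluded", ".1.3.6.1.2.1.4.21.1.1"),  # ipRouteDest
    ("excluded", ".1.3.6.1.2.1.4.21"),      # ipRouteTable
    ("excluded", ".1.3.6.1.2.1.4.22"),      # ipNetToMediaTable
    ("excluded", ".1.3.6.1.2.1.4.35"),      # ipNetToPhysicalTable
    ("excluded", ".1.3.6.1.2.1.3"),         # at
]


def parse_oid(text):
    """Turn '.1.3.6' into the tuple (1, 3, 6)."""
    return tuple(int(part) for part in text.strip(".").split("."))


def in_view(oid, view=VIEW_NO_ROUTES):
    """Return True if the OID is visible through the view.

    Assumption: the longest subtree that is a prefix of the OID decides,
    and an OID matched by no subtree at all is not visible.
    """
    target = parse_oid(oid)
    best = None
    for kind, subtree in view:
        prefix = parse_oid(subtree)
        if target[:len(prefix)] == prefix:
            if best is None or len(prefix) > len(best[1]):
                best = (kind, prefix)
    return best is not None and best[0] == "included"


if __name__ == "__main__":
    # ifInOctets.1: only ".1" matches, so interface counters stay visible.
    print(in_view(".1.3.6.1.2.1.2.2.1.10.1"))
    # An ipRouteDest row: the excluded routing subtrees are more specific.
    print(in_view(".1.3.6.1.2.1.4.21.1.1.10.0.0.0"))
```

Running it prints True for the interface counter and False for the routing table row, which matches what the view is meant to do: the monitoring server can still walk from the root, but the walk never descends into the 500k-entry routing tables.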