Cisco Network Mgmt Protocol FAQ: Management Communication Patterns
Q1. What are the fundamental interaction patterns between managers and agents?
Answer: Manager-initiated request and response, and agent-initiated events.
Q2. Assume that you have a network with 1000 devices and 1500 links. Assume that a performance management application is interested in 18 performance parameters per link and 7 performance parameters per device. Assume that with incremental information-retrieval requests, you can retrieve 5 parameters at a time. Someone asks you to build an application that will keep a database of historical information of those parameters, using 15-minute intervals. What rate of management requests and responses must your application support per second? Furthermore, if it takes an average of 5 seconds to receive a response from a device, how many requests must the application be capable of handling in parallel?
Answer: With 5 parameters per request, polling the 18 link parameters requires four requests per link (18 / 5, rounded up), and polling the 7 device parameters requires two requests per device. This means that 1500 × 4 + 1000 × 2 = 8000 requests and responses need to be handled per 15-minute interval. Fifteen minutes is 900 seconds, which gives about 8.9 requests per second; allowing for some request timeouts and retries, the sustained rate needs to be approximately 10 requests and responses per second. If it takes 5 seconds to receive a response from the device, the application needs to be capable of handling an average of 10 × 5 = 50 outstanding, or parallel, requests at any given point in time.
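The arithmetic in this answer can be checked with a short script. This is a minimal sketch; the function name `polling_load` and its parameter names are made up for illustration. It computes the raw figures before any headroom for timeouts and retries, which is why it reports slightly less than the rounded-up values of 10 requests per second and 50 parallel requests.

```python
import math

def polling_load(links, devices, params_per_link, params_per_device,
                 params_per_request, interval_s, response_time_s):
    """Return (requests per interval, requests per second, average in flight)."""
    # Each request carries at most params_per_request parameters, so round up.
    req_per_link = math.ceil(params_per_link / params_per_request)
    req_per_device = math.ceil(params_per_device / params_per_request)
    total = links * req_per_link + devices * req_per_device
    rate = total / interval_s
    # Average number of outstanding requests = arrival rate x response time.
    in_flight = rate * response_time_s
    return total, rate, in_flight

total, rate, in_flight = polling_load(
    links=1500, devices=1000, params_per_link=18, params_per_device=7,
    params_per_request=5, interval_s=15 * 60, response_time_s=5)
print(total, round(rate, 2), round(in_flight, 1))  # 8000 8.89 44.4
```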
Q3. What do you call the capability to apply the same management operation to multiple managed objects simultaneously, using only one management request?
Answer: Scoping.
Q4. One important technique that could be supported by devices to facilitate management transactions involves locking the device—that is, allowing a single management session to take management “ownership” of the device and allow no one else to modify the configuration during that time. Such a capability is very powerful, but in what ways does it still fall short of true management transaction support? For bonus points, can you think of new management issues that it introduces?
Answer: The capability to lock a device is not the same as a management transaction. For example, it does not include rollback capabilities. Locking also imposes performance limitations if multiple management applications and users want to configure different aspects of the device simultaneously, because operations need to be serialized instead of being executed concurrently. To be effective, locking requires applications to be fair and not “hog” a device, that is, apply a lock without releasing it. It also requires a mechanism to break locks, for example, if an application is no longer capable of releasing a lock (or forgets to do so).
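One common way to address the lock-breaking issue is to give every lock a lease that expires. The following is a minimal sketch of that idea, not a description of any particular device's behavior; the class `DeviceLock`, its methods, and the 30-second lease are assumptions chosen for illustration. The current time is passed in explicitly so that the example is deterministic.

```python
class DeviceLock:
    """Exclusive configuration lock with a lease, so stale locks expire."""

    def __init__(self, lease_s):
        self.lease_s = lease_s
        self.owner = None
        self.expires_at = 0.0

    def acquire(self, owner, now):
        # A lock whose lease has run out is treated as broken.
        if self.owner is not None and now >= self.expires_at:
            self.owner = None
        # Grant (or renew) the lock if it is free or already held by the caller.
        if self.owner is None or self.owner == owner:
            self.owner = owner
            self.expires_at = now + self.lease_s
            return True
        return False

    def release(self, owner):
        # Only the current owner may release the lock.
        if self.owner == owner:
            self.owner = None
            return True
        return False

lock = DeviceLock(lease_s=30)
print(lock.acquire("app-A", now=0))   # True
print(lock.acquire("app-B", now=10))  # False: app-A still holds the lease
print(lock.acquire("app-B", now=31))  # True: app-A's lease has expired
```

Note that the sketch only serializes access. As the answer points out, nothing here undoes a partially applied configuration, so it still falls short of a transaction.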
Figure: A Management Transaction on a Management Agent
Q5. One technique that can be used to roll back management transactions involves reverting to an earlier configuration file. Discuss advantages and drawbacks of this technique.
Answer:
Advantages:
- Very effective
- Simple, straightforward application logic and semantics
Drawbacks:
- Performance overhead in persisting the configuration file at the beginning of a transaction, as well as in reverting to that configuration file in case of rollback, which could make the system too slow to be practical
- Does not provide “locking,” and may also roll back configuration changes made independently by someone else while the transaction was in progress
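The trade-off in the points above can be illustrated with a small simulation. This is a sketch under stated assumptions: the configuration is modeled as a plain dictionary rather than a file, and the helper `run_with_snapshot_rollback` and the change functions are hypothetical. It shows both the simplicity of the approach and how an independent change made during the transaction is lost on rollback.

```python
import copy

def run_with_snapshot_rollback(config, changes):
    """Apply changes in order; on any failure, restore the initial snapshot."""
    snapshot = copy.deepcopy(config)  # this cost is paid on every transaction
    try:
        for change in changes:
            change(config)
        return True
    except Exception:
        # Revert wholesale to the earlier configuration.
        config.clear()
        config.update(snapshot)
        return False

def set_mtu(cfg):
    cfg["mtu"] = 9000

def other_user_change(cfg):
    # Simulates an independent change that lands while the transaction runs.
    cfg["hostname"] = "edge-1"

def failing_step(cfg):
    raise ValueError("validation failed")

config = {"hostname": "r1", "mtu": 1500}
ok = run_with_snapshot_rollback(config, [set_mtu, other_user_change, failing_step])
print(ok, config)  # False {'hostname': 'r1', 'mtu': 1500}
```

The hostname change did not belong to the failed transaction, yet the rollback discards it along with the MTU change.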
Q6. Why can management actions never be subjected to management transactions?
Answer: They might involve real effects that cannot be undone as part of a rollback, such as an action that leads to a reboot.
Q7. In network management, what is an alarm?
Answer: An alarm is an event that indicates the onset or the remission of an alarm condition.
Q8. Does a TCA have more in common with a configuration-change event or an alarm? Why?
Answer: A TCA has more in common with an alarm. A threshold-crossing alert is essentially a special type of alarm. Like an alarm, it is used to signal the onset (or remission) of a condition, in this case the crossing of a threshold.
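The onset and remission behavior described above can be sketched in a few lines. The function `threshold_events`, the sample values, and the threshold of 80 are illustrative assumptions; a real agent would typically add refinements such as separate set and clear thresholds.

```python
def threshold_events(samples, threshold):
    """Return (sample index, kind) pairs for each threshold crossing."""
    events = []
    above = False  # whether the threshold condition is currently active
    for i, value in enumerate(samples):
        if value > threshold and not above:
            events.append((i, "onset"))
            above = True
        elif value <= threshold and above:
            events.append((i, "remission"))
            above = False
    return events

print(threshold_events([40, 85, 90, 70, 95], threshold=80))
# [(1, 'onset'), (3, 'remission'), (4, 'onset')]
```

An event is emitted only when the state changes, not for every sample above the threshold, which is what makes a TCA behave like an alarm rather than like a stream of measurements.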
Q9. Is it possible to support polling-based alarm management? If so, why is alarm management generally event based?
Answer: It would be possible if devices maintained their current alarm state. A management application would then need to poll the devices repeatedly for their alarm information. In practice, this is infeasible for several reasons. It would impose an extremely high load on the alarm management application (compare this with the earlier question on polling-based performance management). In addition, the delay between polling cycles would prevent alarm management in near real time, which is unacceptable in practice. Event-based alarm management is therefore the only feasible option.
Q10. Name three techniques that can be used to make events reliable.
Answer: Use of a reliable transport protocol; consecutive sequence numbers combined with a replay capability; and acknowledgment of events combined with a retransmission capability.
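The sequence-number technique can be sketched as follows. This is a minimal illustration, assuming events are numbered consecutively from 1 and that the sender can replay any event on request; the class `EventReceiver` and the event names are hypothetical.

```python
class EventReceiver:
    """Detects lost events by checking for gaps in the sequence numbers."""

    def __init__(self):
        self.expected = 1      # next sequence number we expect to see
        self.delivered = []    # payloads handed to the application, in order

    def receive(self, seq, payload):
        """Return the sequence numbers that must be replayed, if any."""
        if seq < self.expected:
            return []  # duplicate (for example, a retransmission); ignore it
        missing = list(range(self.expected, seq))
        if missing:
            # Gap detected: hold back delivery and ask the sender for a replay.
            return missing
        self.delivered.append(payload)
        self.expected = seq + 1
        return []

rx = EventReceiver()
print(rx.receive(1, "linkDown"))  # []
print(rx.receive(3, "fanFail"))   # [2]  (event 2 was lost)
print(rx.receive(2, "linkUp"))    # []   (replayed event fills the gap)
print(rx.receive(3, "fanFail"))   # []
print(rx.delivered)               # ['linkDown', 'linkUp', 'fanFail']
```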