Exploring Secure Voice Solutions
Defining Voice Fundamentals
This section begins by defining voice over IP and considering why it is needed in today’s corporate environment. Because voice packets are flowing across a data infrastructure, various protocols are required to set up, maintain, and tear down a call. This section defines several popular voice protocols, in addition to hardware components that make up a voice over IP network.
Defining VoIP
VoIP sends packetized voice over an IP network. Typically, the IP network serves as a data network as well, resulting in potential quality and security issues. Fortunately, Cisco offers a collection of quality of service (QoS) and security features to ensure the quality and security of voice transmissions.
The ability to transmit voice over an IP network (for example, the Internet) allows many corporate networks to readily interconnect their sites without purchasing dedicated leased lines between their sites or relying on the public switched telephone network (PSTN), which imposes charges for certain call types (for example, long distance and international calls).
With the advent of VoIP technology, some confusion has arisen around its associated nomenclature. For example, consider the terms VoIP and IP telephony. Both refer to sending voice across an IP network. However, the primary distinction revolves around the endpoints in use. For example, in a VoIP network, traditional analog or digital circuits connect into an IP network, typically through some sort of gateway. However, an IP telephony environment contains endpoints that natively communicate using IP.
To further illustrate the distinction between VoIP and IP telephony, consider Figure 9-1. In the top portion of the figure, the endpoints in the VoIP network are an analog phone (connected to an analog port on a gateway) and a private branch exchange (PBX) (connected to a digital port on a different gateway). Because neither of these endpoints natively speaks IP, the topology is considered a VoIP network. The bottom portion of Figure 9-1 shows a Cisco IP phone, which does natively communicate using IP. The Cisco IP phone registers with a Cisco Unified Communications Manager server, which makes call routing decisions on behalf of the Cisco IP phone. Therefore, the bottom topology in the figure is considered an IP telephony network. Realize, however, that some literature might use the terms VoIP and IP telephony interchangeably.
The Need for VoIP
Originally, one of the primary business drivers for the adoption of VoIP was saving money on long distance calls. Increased competition in the industry, however, drove down the cost of long distance calls to the point that cost savings alone were insufficient motivation for migrating a PBX-centric telephony solution to a VoIP network. Even so, several other justifications exist for purchasing VoIP technology:
- Reduced recurring expenses: In many traditional PBX-centric networks, a digital T1 circuit provides either 23 or 24 channels for voice calls, depending on the type of signaling in use. Each channel has a bandwidth of 64 kbps and can handle one, and only one, phone call. VoIP networks, however, often leverage coder/decoders (codecs) to compress voice, so each call can consume less than 64 kbps of bandwidth, allowing more simultaneous calls over the same bandwidth than traditional technology.
- Adaptability: Because VoIP networks send voice traffic over an IP network, administrators have a high level of control over the voice traffic. Different customers could be granted access to different voice applications (for example, a messaging application or an interactive voice response [IVR] application).
- Advanced functionality: VoIP and IP telephony networks can also offer advanced features, such as the following:
- Call routing: Existing routing protocols (for example, EIGRP and OSPF) could be used to provide rapid failover to a backup link if a primary network link failed. Additionally, calls could be routed over different network links based on link quality or the link’s current traffic load.
- Messaging: A solution such as Cisco Unity could be used to provide a single repository for a variety of messaging types. For example, a Microsoft Exchange message store could be used to consolidate the storage of fax transmissions, e-mail messages, and voice mail. Then a user could, for example, call into a Cisco Unity system and have her e-mail read to her via text-to-speech conversion.
- Call center solutions: Cisco offers a variety of solutions for call centers. For example, Cisco’s Contact Center and Contact Center Express solutions can intelligently route incoming calls to appropriate call center agents. Also, because the call center would be using Cisco IP Phones, the phones can be geographically separated (for example, call center agents working from home).
- Security: If an attacker were to intercept and capture VoIP packets, he could potentially play them back to eavesdrop on a conversation. As another example, a user might enter her personal identification number (PIN) into a bank’s IVR system, and the attacker might capture those packets. Attackers might also introduce rogue devices (for example, IP phones or call agent servers) into the network. Fortunately, Cisco offers a variety of technologies and best practices for hardening the security of a VoIP network.
- Customer-facing solutions: Some customers might prefer to interact via a chat interface or e-mail, as opposed to talking with a company’s customer service representative. Because a VoIP network works over a data network, data network features such as chat and e-mail can be integrated into a customer’s selection of contact options, thereby increasing the customer’s level of satisfaction.
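The channel arithmetic behind the "Reduced recurring expenses" item can be sketched as follows. The 24-channel and 64-kbps figures come from the text above. The 24-kbps per-call figure is a hypothetical value chosen only to illustrate a compressed call; it is not a measured rate for any particular codec.

```python
# Sketch: how many simultaneous calls fit in the bandwidth of a T1's channels.
T1_CHANNELS = 24      # channels on a T1 (23 with some signaling types)
CHANNEL_KBPS = 64     # bandwidth of each channel


def simultaneous_calls(link_kbps: int, per_call_kbps: int) -> int:
    """Return the number of whole calls that fit in link_kbps."""
    if per_call_kbps <= 0:
        raise ValueError("per_call_kbps must be positive")
    return link_kbps // per_call_kbps


link_kbps = T1_CHANNELS * CHANNEL_KBPS  # 1536 kbps of channel bandwidth

# Traditional telephony: one call per 64-kbps channel.
traditional = simultaneous_calls(link_kbps, CHANNEL_KBPS)

# Hypothetical compressed call (ASSUMED 24 kbps per call, for illustration).
ASSUMED_COMPRESSED_KBPS = 24
compressed = simultaneous_calls(link_kbps, ASSUMED_COMPRESSED_KBPS)

print(f"Uncompressed calls: {traditional}, compressed calls: {compressed}")
```

The actual gain depends on the codec selected and on per-packet header overhead, which this sketch does not model.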
VoIP Network Components
Figure 9-2 illustrates components commonly found in a VoIP network. They are described in Table 9-2.
VoIP Protocols
To support communication among Cisco IP Phones, analog telephones, traditional PBXs, and the PSTN (as just a few examples), VoIP networks require a collection of protocols. Some protocols are signaling protocols (for example, H.323, MGCP, H.248, SIP, and SCCP) used to set up, maintain, and tear down a call. Other protocols are targeted at the actual voice packets (for example, RTP, SRTP, and RTCP) rather than signaling information. Table 9-3 describes some of the more common VoIP protocols.
Identifying Common Voice Vulnerabilities
Because IP phones are readily accessible and plentiful in many corporate environments, they become attractive targets for attackers. Also, VoIP administrators should be on guard against VoIP variations of spam and phishing (both common in e-mail environments), as well as toll fraud (common in PBX environments). This section details these common attack targets for a VoIP network.
Attacks Targeting Endpoints
Table 9-4 describes a few common VoIP attacks targeting endpoints.
VoIP Spam
Unless your e-mail account is well protected by a spam filter, you probably occasionally receive unsolicited e-mails. Spam can be an annoyance for an e-mail user. VoIP administrators should be aware of VoIP spam, more commonly called spam over IP telephony (SPIT). A SPIT attack on your Cisco IP Phone could, for example, make unsolicited messages periodically appear on a phone’s LCD screen or make the phone periodically ring, resulting in lost employee productivity. SPIT can also be used for fraud. For example, a SPIT attack could make incorrect caller ID information appear on your phone.
Unfortunately, common methods of mitigating e-mail spam are ineffective against SPIT. For example, a SPIT attack launched against your phone might cause your phone to ring every ten minutes. Although this behavior is certainly annoying and affects your productivity, the frequency of the calls is probably too low to be detected as malicious traffic (for example, by an Intrusion Prevention System [IPS] sensor).
However, modern Cisco IP Phones can be configured for authentication using Transport Layer Security (TLS). This approach allows a Cisco IP Phone to authenticate any other device attempting to communicate directly with the phone. As a result, the nonauthenticated devices that source the SPIT would not be allowed to communicate with the Cisco IP Phone.
Vishing and Toll Fraud
The term phishing recently entered the technical vernacular. The basic concept of phishing is an attacker sending an e-mail to a user. The e-mail appears to be from a legitimate business. The user is asked to confirm her information by entering data on a web page, such as her social security number, bank or credit card account number, birth date, or mother’s maiden name. The attacker can then take this user-provided data and use it for fraudulent purposes.
Similar to phishing, the term vishing refers to maliciously collecting such information over the phone. Because many users tend to trust the security of a telephone more than the security of the web, some users are more likely to provide confidential information over the telephone. User education is the most effective method to combat vishing attacks.
Another type of fraud committed against telephony systems is toll fraud. The basic concept of toll fraud is an attacker using a telephony system to place calls he should not be allowed to place. For example, a corporate telephony use policy might state that long distance personal calls are not allowed. If an employee ignored that directive and placed a personal long distance call, that would be a simple example of toll fraud.
More advanced forms of toll fraud involve taking advantage of a weakness in the telephony system to place calls. Cisco Unified Communications Manager includes several features that help combat toll fraud. For example, partitions and calling search spaces can be used to identify which phone numbers can be called from specific Cisco IP Phones. As another example, a Forced Authorization Code (FAC) could be used to require a user to enter a code to call a particular destination.
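The idea behind partitions and calling search spaces can be modeled roughly as follows. This is a conceptual sketch only, not Cisco Unified Communications Manager's implementation or API. The partition names, phone names, and number patterns are hypothetical, and patterns are simplified to dialed-digit prefixes.

```python
# Conceptual model: a phone may dial a number only if the number matches a
# pattern in a partition that appears in the phone's calling search space.

# Hypothetical partitions, each holding simplified prefix patterns.
PARTITIONS = {
    "Internal": ["1"],       # internal extensions beginning with 1
    "Local": ["9555"],       # local calls after dialing 9
    "LongDistance": ["91"],  # long distance calls after dialing 9
}

# Hypothetical calling search spaces: the partitions each phone may reach.
CALLING_SEARCH_SPACES = {
    "LobbyPhone": ["Internal"],
    "EmployeePhone": ["Internal", "Local"],
    "ManagerPhone": ["Internal", "Local", "LongDistance"],
}


def can_dial(phone: str, dialed: str) -> bool:
    """Return True if any partition in the phone's search space matches."""
    for partition in CALLING_SEARCH_SPACES.get(phone, []):
        if any(dialed.startswith(prefix) for prefix in PARTITIONS[partition]):
            return True
    return False


print(can_dial("LobbyPhone", "1001"))            # internal extension
print(can_dial("LobbyPhone", "914085551212"))    # long distance from lobby
print(can_dial("ManagerPhone", "914085551212"))  # long distance from manager
```

In this model, a phone with no calling search space entry cannot dial anything, which mirrors the goal of limiting reachable destinations per phone.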
SIP Attack Targets
The previously described Session Initiation Protocol (SIP) is gaining rapid acceptance in mixed-vendor VoIP networks. One of the most attractive characteristics of SIP is its use of existing protocols. Also, by default, SIP messages are sent in plain text.
Unfortunately, the very characteristics that make SIP attractive can also be leveraged by attackers to compromise the security of a SIP network. For example, an attacker could launch a man-in-the-middle attack, in which the attacker convinces a router, phone, or SIP server to send SIP and/or RTP packets to the attacker’s PC. The attacker could then perform registration hijacking, which allows the attacker to intercept incoming calls and determine how those calls are routed.
Also, because SIP messages are transmitted in plain text by default, an attacker could manipulate the SIP messages. For example, the attacker could change the SIP addresses in the messages. This type of attack is known as message tampering.
Because SIP networks often rely on SIP servers (for example, SIP registrar, location, proxy, and/or redirect servers), an attacker could also launch a DoS attack against one of those servers. For example, if a DoS attack made a SIP registrar server unusable, new SIP phones would be unable to register with the network.
Cisco offers several solutions for combating such attempts to attack a SIP network. For example, a secure tunnel, such as IPsec, could be used to encrypt SIP messages traveling between routers. In fact, a Cisco Unified Communications Manager server could act as a peer in an IPsec tunnel. Also, a firewall or IPS sensor could be used to detect and mitigate common DoS attacks. Cisco Catalyst switches could be used to help prevent a man-in-the-middle attack (for example, using Dynamic ARP Inspection [DAI]).
Table 9-5 summarizes some of the attacks described in this section.
Securing a VoIP Network
Now that you have a foundational understanding of the myriad attacks that can target a VoIP network, this section addresses specific VoIP attack mitigations. Specifically, it covers separating voice traffic from data traffic using voice VLANs, using firewalls and VPNs to protect voice traffic, and approaches to harden the security of voice endpoints and servers.
Protecting a VoIP Network with Auxiliary VLANs
A fundamental approach to protecting voice traffic from attackers is to place it in a VLAN separated from data traffic. This voice VLAN is often called an auxiliary VLAN. VLAN separation alone protects voice traffic from a variety of Layer 2 attacks. For example, an attacker would be unable to launch a man-in-the-middle attack against an IP Phone, where the attacker’s MAC address claimed to be the MAC address of the IP Phone’s next-hop gateway. Such an attack would be mitigated, because the attacker’s PC would be connected to a data VLAN while the IP Phone was connected to the auxiliary VLAN.
Many models of Cisco IP Phones include an extra Ethernet port to which a PC can attach. The attached PC communicates through the Cisco IP Phone into a Cisco Catalyst switch at the access layer. Fortunately, the PC and the Cisco IP Phone can transmit traffic in separate VLANs (that is, a data VLAN for the PC’s traffic and an auxiliary VLAN for the phone’s voice traffic) while still connecting to a single Cisco Catalyst switch port.
In Figure 9-4, notice that the PC communicates on VLAN 110, while the Cisco IP Phone sends voice traffic on VLAN 210. Traffic from both VLANs enters Switch1 on port Gigabit 0/1. A single Cisco Catalyst switch port accommodating traffic from two VLANs might seem like a trunk port. Interestingly, Cisco makes an exception on many Cisco Catalyst switch models, allowing the port that accepts traffic from the data VLAN and the auxiliary VLAN to be an access port.
Protecting a VoIP Network with Security Appliances
Security appliances such as firewalls and VPN termination devices also can be used to protect voice networks. However, one challenge of protecting voice networks with a firewall is that the administrator is unsure what UDP ports will be used to transmit the RTP voice packets. For example, in a Cisco environment, a UDP port for an RTP stream typically is an even-numbered port selected from the range of 16,384 to 32,767. Opening this entire range of potential ports could open unnecessary security holes.
Fortunately, Cisco firewalls (that is, the PIX and Adaptive Security Appliance [ASA] firewalls) can dynamically inspect call setup protocol traffic (for example, H.323 traffic) to learn the UDP ports to be used for RTP flows. The firewall then temporarily opens those UDP ports for the duration of the RTP connection.
To understand this concept, consider Figure 9-5. In the first step, the Cisco IP Phone uses SCCP to initiate a call to the PSTN. SCCP, which uses TCP port 2000, is used to communicate between the Cisco IP Phone and the UCM server. UCM determines, based on the dialed digits, that the call needs to be sent out the H.323 gateway. In the second step, using TCP port 1720, UCM initiates a call setup with the H.323 gateway. The firewall between these devices is configured to permit the H.323 protocol. The firewall is also instructed to inspect H.323 traffic, to dynamically determine which UDP ports are selected for the voice path. In the third step, UDP ports 20,548 and 28,642 are randomly selected. Because an RTP flow is unidirectional, two UDP ports are selected to support bidirectional communication. Because the firewall inspected H.323 and dynamically learned the UDP ports to be used, the firewall permits the bidirectional RTP flow for the duration of the conversation.
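The dynamic port handling in this example can be sketched as a small table of temporarily opened ports. This is a minimal illustration under stated assumptions, not how a PIX or ASA firewall is implemented: the `PinholeTable` class and its method names are hypothetical, and the inspection of the signaling traffic is abstracted away by passing the learned ports in directly.

```python
class PinholeTable:
    """Tracks UDP ports temporarily opened for RTP flows (hypothetical model)."""

    def __init__(self):
        self._open = {}  # call identifier -> set of permitted UDP ports

    def on_call_setup(self, call_id, rtp_ports):
        """Open the ports learned by inspecting the call setup signaling."""
        self._open[call_id] = set(rtp_ports)

    def on_call_teardown(self, call_id):
        """Close the ports when the call ends."""
        self._open.pop(call_id, None)

    def permits(self, udp_port):
        """Return True only if the port belongs to an active call."""
        return any(udp_port in ports for ports in self._open.values())


firewall = PinholeTable()

# Ports from the example: one per direction of the bidirectional voice path.
firewall.on_call_setup("call-1", [20548, 28642])
print(firewall.permits(20548))  # port opened for the active call
print(firewall.permits(16384))  # in the RTP range, but never negotiated

firewall.on_call_teardown("call-1")
print(firewall.permits(20548))  # closed again after the call ends
```

The point of the design is that the full range of 16,384 to 32,767 never has to be opened; only the ports negotiated for an active call are permitted, and only while that call lasts.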
Aside from permitting or denying specific ports, a firewall can also provide additional protection to a voice network. For example, a firewall can be configured to enforce specific policies, which might block specific phones. As another example, a firewall can determine if too many messages (as configured with a threshold value) of a certain type (for example, SIP requests) occur within a certain period of time.
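A threshold check of this kind can be sketched with a sliding time window. The class name, the threshold of three messages, and the ten-second window below are all hypothetical values chosen for illustration; they do not reflect any Cisco default or configuration syntax.

```python
from collections import deque


class RateThreshold:
    """Flags when more than max_messages arrive within window_seconds."""

    def __init__(self, max_messages, window_seconds):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, now):
        """Record one message at time `now` (in seconds).

        Returns True if the threshold is exceeded after this message.
        """
        self._timestamps.append(now)
        # Discard messages that have aged out of the window.
        while now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_messages


# Hypothetical policy: at most 3 SIP requests in any 10-second window.
sip_requests = RateThreshold(max_messages=3, window_seconds=10.0)

for arrival in [0.0, 1.0, 2.0, 3.0, 20.0]:
    exceeded = sip_requests.record(arrival)
    print(f"t={arrival:>4}: threshold exceeded = {exceeded}")
```

Timestamps are passed in explicitly so that the behavior is easy to trace; a live system would take them from a clock.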
Although many Cisco IP Phones can encrypt and authenticate traffic within the phone itself, many other IP telephony and VoIP devices lack this capability. To add encryption and authentication support for these devices, consider sending their voice packets over an IPsec-protected VPN tunnel. A variety of devices could be used for VPN termination, including Cisco Unified Communications Manager (version 5.0 and later). Figure 9-6 shows an IPsec tunnel encrypting traffic between a Cisco Unified Communications Manager server and an H.323 gateway.
Hardening Voice Endpoints and Application Servers
Endpoints, such as Cisco IP Phones, tend to be less protected than other strategic devices (for example, servers) in a voice network. Therefore, attackers often try to gain control of an endpoint and use that as a jumping-off point to attack other systems. An attacker might be able to gain control of a Cisco IP Phone by modifying the image or configuration file used by the phone.
Alternatively, the attacker could capture packets from the PC switch port on the Cisco IP Phone, or from the network (in a man-in-the-middle attack). Interestingly, an attacker could get the IP addresses of other servers (for example, DNS, DHCP, and TFTP servers) by simply pointing a web browser to the Cisco IP Phone’s IP address.
Figure 9-7 shows some of the information that can be gleaned by pointing a web browser to the IP address of a Cisco IP Phone. Access to this web page, made possible by the web access feature enabled on the phone (which is a default setting), does not require any login credentials. The information is freely available.
To help tighten the security of a Cisco IP Phone, beginning in Cisco Unified Communications Manager (UCM) 3.3(3), Cisco introduced support for phone image authentication, in which Cisco Manufacturing digitally signs the image file. As of UCM 4.0 and later, Cisco supports configuration file authentication in addition to image file authentication.
Several proactive steps can be taken by navigating to a phone’s configuration page within the Cisco Unified Communications Manager Administration interface, as shown in Figure 9-8, and changing some of the default settings.
Recall that a Cisco IP Phone makes a collection of configuration information freely available by pointing a web browser to the IP address of the Cisco IP Phone. This potential weakness can be mitigated by changing the Web Access parameter from Enabled to Disabled. Also, to prevent a man-in-the-middle attack, you could change the Gratuitous ARP setting from Enabled to Disabled. By disabling the gratuitous ARP feature, you are preventing a Cisco IP Phone from believing unsolicited Address Resolution Protocol (ARP) replies, which potentially could have come from an attacker claiming to be the next-hop gateway for the Cisco IP Phone.
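The effect of the Gratuitous ARP setting can be illustrated with a toy ARP cache. This is a conceptual sketch, not the phone's firmware logic: the `ArpCache` class, its methods, and the addresses are hypothetical, and the model only distinguishes replies that answer an outstanding request from replies that arrive unsolicited.

```python
class ArpCache:
    """Toy ARP cache that can ignore unsolicited (gratuitous) replies."""

    def __init__(self, accept_gratuitous):
        self.accept_gratuitous = accept_gratuitous
        self._pending = set()  # IP addresses with an outstanding request
        self.table = {}        # IP address -> MAC address

    def send_request(self, ip):
        """Note that a request for this IP address is outstanding."""
        self._pending.add(ip)

    def receive_reply(self, ip, mac):
        """Process a reply; return True if the cache entry was updated."""
        solicited = ip in self._pending
        if not solicited and not self.accept_gratuitous:
            return False  # unsolicited reply is not believed
        self._pending.discard(ip)
        self.table[ip] = mac
        return True


# Hypothetical addresses for the phone's next-hop gateway and an attacker.
GATEWAY_IP = "10.1.1.1"
GATEWAY_MAC = "aa:aa:aa:aa:aa:aa"
ATTACKER_MAC = "ee:ee:ee:ee:ee:ee"

phone = ArpCache(accept_gratuitous=False)
phone.send_request(GATEWAY_IP)
phone.receive_reply(GATEWAY_IP, GATEWAY_MAC)   # solicited reply is accepted
phone.receive_reply(GATEWAY_IP, ATTACKER_MAC)  # unsolicited reply is ignored
print(phone.table[GATEWAY_IP])
```

With `accept_gratuitous=True`, the second reply would overwrite the gateway entry with the attacker's MAC address, which is the behavior the Disabled setting is meant to prevent.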
Aside from voice endpoints, other popular attack targets on voice networks include application servers, such as a Cisco UCM server. Fortunately, Cisco provides an already hardened version of the operating system that runs on a UCM server.
NOTE In UCM versions 3.x and 4.x, the underlying operating system is Microsoft Windows 2000. In UCM versions 5.x and 6.x, the underlying operating system is based on Red Hat Linux.
The UCM server’s operating system already has several unneeded services disabled. Depending on the UCM version, many default usernames (such as “root”) are disabled, and the Cisco Security Agent (CSA) Host-based Intrusion Prevention System (HIPS) product is installed. Also, beginning in UCM 5.0, UCM supports IPsec.
However, depending on the role of a particular application server, you might decide to turn off additional services that are unneeded. By effectively combining the security solutions Cisco makes available for IP telephony networks, you can prevent and/or defeat the vast majority of attacks that could be launched against your voice network.
Summary of Voice Attack Mitigation Techniques
Table 9-6 summarizes some of the methods of mitigating attacks that were described in this section.