ICCP may get stuck in the “In progress/Down” state when the design or configuration is asymmetric between the two PEs. Possible causes include: a Junos version mismatch between the two PEs; a BFD single-hop/multihop mismatch between the two PEs.
The following example attempts to build an ICCP connection between R1 and R2. R1 is running Junos 12.3R9.4 and R2 is running 13.3R1.8. The configuration of R1 is shown. The configuration of R2 is almost identical except for the IP addresses.
R1--------------------------R2

labroot@r1> show configuration protocols iccp
local-ip-addr 192.168.239.193;
peer 192.168.239.194 {
    redundancy-group-id-list 1;
    liveness-detection {
        version 1;
        minimum-interval 1000;
        minimum-receive-interval 1000;
        multiplier 1;
        transmit-interval {
            minimum-interval 1000;
        }
        detection-time {
            threshold 2000;
        }
    }
}

labroot@r1> show configuration interfaces ae5
description MCLAG-INTERCHASSIS;
flexible-vlan-tagging;
aggregated-ether-options {
    minimum-links 1;
    link-speed 10g;
    lacp {
        active;
        periodic fast;
        system-priority 100;
    }
}
unit 0 {
    family bridge {
        interface-mode trunk;
        vlan-id-list [ 3001-3499 2517 ];
    }
}
unit 1 {
    vlan-id 3000;
    family inet {
        address 192.168.239.193/30;
    }
}

labroot@r1> show configuration multi-chassis
multi-chassis-protection 192.168.239.194 {
    interface ae5;
}

labroot@r1> show iccp

Redundancy Group Information for peer 192.168.239.194
  TCP Connection       : In progress
  Liveliness Detection : Unknown
  Redundancy Group ID          Status
    1                           Down

Client Application: lacpd
  Redundancy Group IDs Joined: None

Client Application: l2ald_iccpd_client
  Redundancy Group IDs Joined: None

labroot@r1> show bfd session detail

0 sessions, 0 clients
Cumulative transmit rate 0.0 pps, cumulative receive rate 0.0 pps
ICCP relies on a TCP connection and on an underlying BFD session. If either of them cannot be established, ICCP stays stuck, as in the output above: the TCP connection remains “In progress” and no BFD session exists.
-For Junos mismatch: A minor design change in 12.3R5 and later makes these releases incompatible with other major releases such as 13.3 and 14.1.
-For BFD mismatch: Single-hop BFD is distributed by default and runs on the Packet Forwarding Engine. Multihop BFD is centralized by default and runs on the Routing Engine. In an MC-LAG setup, the Inter-Chassis Control Protocol (ICCP) uses BFD in multihop mode. If one PE runs single-hop and the other runs multihop, the BFD session cannot be established, so ICCP does not come up.
-Though not strictly required, it is always recommended to run the same Junos version on both PEs.
-If the Junos versions do not match, both PEs can be changed to single-hop mode as a workaround to bring up ICCP: “set protocols iccp peer X.X.X.X liveness-detection single-hop”
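Applied to the example above, the workaround might look like the following sketch. The peer addresses are taken from the R1 configuration shown earlier. The R2 prompt (labroot@r2) is an assumption, since only R1's output is given, and the single-hop statement is assumed to be available in the releases running on both PEs.

```
[edit]
labroot@r1# set protocols iccp peer 192.168.239.194 liveness-detection single-hop
labroot@r1# commit

[edit]
labroot@r2# set protocols iccp peer 192.168.239.193 liveness-detection single-hop
labroot@r2# commit
```

After committing on both PEs, rerun “show iccp” and “show bfd session detail” on each side to confirm that the TCP connection and the BFD session are established.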