Saturday, August 28, 2010

Decoding the RIPE BGP experiment

Many of you probably saw your BGP routers go crazy on the morning of Friday, August 27th, especially if you happened to have a CRS (or another router running IOS-XR, such as a C12k or ASR9k) in your own network or a nearby one.

RIPE and Duke University ran an experiment with Quagga's BGP daemon, and as a result some routers reset their BGP sessions because they were receiving malformed BGP UPDATE packets. The malformed packets were generated by other routers along the path, not by the Quagga BGP daemon where the experiment started.

Routers that had peerings with affected (i.e. IOS-XR) routers logged error messages like the following:


%BGP-3-NOTIFICATION: sent to neighbor x.x.x.x 3/1 (update malformed) 188 bytes F0630BB8 00000000 00000000 00000000 00
BGP: x.x.x.x Bad attributes FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0118
0200 0000 FD40 0101 0040 0206 0202 0D1C 316E 4003 04C3 10A1 6180 0404 0000 0000 4005 0400 0000 3CC0 081C 0D1C 0002 0D1C 0016 0D1C 0056 0D1C
01F7 0D1C 029A 0D1C 0813 FDE8 FDDE F063 0BB8 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 185D AF90


According to the BGP RFC (4271), the "3/1" in the error message translates to Error Code "UPDATE Message Error" and Error Subcode "Malformed Attribute List".
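
As a rough illustration of that lookup, here is a minimal Python sketch that maps a "code/subcode" pair to the names listed in RFC 4271. The function name describe_notification and the two dictionaries are my own, not part of any router or library, and only the UPDATE subcodes are decoded.

```python
# Error codes and UPDATE Message Error subcodes as listed in RFC 4271.
# Subcode 7 is deprecated and therefore left out.
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}

UPDATE_SUBCODES = {
    1: "Malformed Attribute List",
    2: "Unrecognized Well-known Attribute",
    3: "Missing Well-known Attribute",
    4: "Attribute Flags Error",
    5: "Attribute Length Error",
    6: "Invalid ORIGIN Attribute",
    8: "Invalid NEXT_HOP Attribute",
    9: "Optional Attribute Error",
    10: "Invalid Network Field",
    11: "Malformed AS_PATH",
}


def describe_notification(pair):
    """Translate a 'code/subcode' string such as '3/1' (hypothetical helper)."""
    code, subcode = (int(part) for part in pair.split("/"))
    code_name = ERROR_CODES.get(code, f"Unknown code {code}")
    if code == 3:
        subcode_name = UPDATE_SUBCODES.get(subcode, f"Unknown subcode {subcode}")
    else:
        # Only UPDATE subcodes are covered by this sketch.
        subcode_name = f"Subcode {subcode} (not decoded in this sketch)"
    return code_name, subcode_name


print(describe_notification("3/1"))
```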

Wireshark offers an "easy" way to decode packets supplied as ASCII hex text, as long as you feed them in the right format. The following Perl script (which is based on my previous ciscodump2text) converts the BGP packet included in the above Cisco error messages into a format that can be understood by Wireshark's text2pcap.


#!/opt/perl/bin/perl
#
# bgpdump2text v0.1
#
# Convert BGP packets included in Cisco BGP notification error messages
# to a special text format that can then be fed into text2pcap
# so a pcap file for Wireshark can be created at the end.
# You have to remove any extra characters included in the error messages.
#
# Copyright (C) 2010 Tassos (http://ccie-in-3-months.blogspot.com)
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see http://www.gnu.org/licenses/.

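use strict;
use warnings;

# Hex digits of each packet, with all whitespace removed.
# Index 0 is unused so that packet numbering starts at 1.
my @packets   = ();
my $in_packet = 0;

while ( my $line = <> ) {

    # A BGP message starts with the 16-byte all-ones marker. Once it has
    # been seen, every following line is treated as part of the packet.
    if ( $in_packet || $line =~ /^FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF/ ) {
        $in_packet = 1;

        $line =~ s/\s//g;
        $packets[1] .= $line;
    }
}

for my $i ( 1 .. $#packets ) {
    next unless defined $packets[$i];

    # Print 16 bytes per line, each line prefixed with its hex offset.
    for ( my $j = 0; $j < length( $packets[$i] ); $j += 2 ) {
        if ( $j == 0 ) {
            printf "# BGP Packet %d\n%08X", $i, $j / 2;
        }
        elsif ( $j % 32 == 0 ) {
            printf " #\n%08X", $j / 2;
        }
        print " " . substr( $packets[$i], $j, 2 );
    }

    print " #\n";
}

print "\n";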


By using as input a text file with the BGP packet as shown in the original error message, you'll get an output text file ready to be processed by text2pcap.

The source text file (test-bgp.text) should have the following format:


FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0118 0200 0000 FD40 0101 0040 0206 0202 0D1C
316E 4003 04C3 10A1 6180 0404 0000 0000 4005 0400 0000 3CC0 081C 0D1C 0002 0D1C 001
6 0D1C 0056 0D1C 01F7 0D1C 029A 0D1C 0813 FDE8 FDDE F063 0BB8 0000 0000 0000 0000 00
00 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0
000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 000
0 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 185D AF90


You just need to erase all extra characters from the original error message and it's ready. The actual data starts after the "Bad attributes" string. Keep in mind that the packet might have been split across more than one message, as in the above case. Just remove the initial characters from every line and it'll be fine.
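
If you would rather not clean the message by hand, the step can be sketched in Python. This is only an illustration: extract_packet_hex is a hypothetical helper, and it assumes that the continuation lines contain nothing but hex groups, as in the example above. Lines that carry any other prefix are skipped and would still need manual cleanup.

```python
import re


def extract_packet_hex(log_text):
    """Return the packet's hex digits found after 'Bad attributes'.

    Assumption: continuation lines consist only of hex digits and spaces.
    """
    digits = []
    started = False
    for line in log_text.splitlines():
        if not started:
            # Ignore everything up to and including the "Bad attributes" string.
            _, sep, rest = line.partition("Bad attributes")
            if not sep:
                continue
            started = True
            line = rest
        line = line.strip()
        # Keep a line only if it is made up purely of hex digits and whitespace.
        if line and re.fullmatch(r"[0-9A-Fa-f\s]+", line):
            digits.append(re.sub(r"\s+", "", line))
    return "".join(digits)


sample = (
    "%BGP-3-NOTIFICATION: sent to neighbor x.x.x.x 3/1 (update malformed)\n"
    "BGP: x.x.x.x Bad attributes FFFF FFFF 0118\n"
    "0200 0000 FD40\n"
)
print(extract_packet_hex(sample))
```

The result is one long string of hex digits. Because it still begins with the FFFF marker once it is re-split into groups of four, it can be reformatted into the layout shown above or converted directly.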

The script is executed as follows:

tassos$ bgpdump2text test-bgp.text > test-bgp.txt


The generated text file (test-bgp.txt) will have the following contents:


# BGP Packet 1
00000000 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF #
00000010 01 18 02 00 00 00 FD 40 01 01 00 40 02 06 02 02 #
00000020 0D 1C 31 6E 40 03 04 C3 10 A1 61 80 04 04 00 00 #
00000030 00 00 40 05 04 00 00 00 3C C0 08 1C 0D 1C 00 02 #
00000040 0D 1C 00 16 0D 1C 00 56 0D 1C 01 F7 0D 1C 02 9A #
00000050 0D 1C 08 13 FD E8 FD DE F0 63 0B B8 00 00 00 00 #
00000060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 #
00000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 #
00000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 #
00000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 #
000000A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 #
000000B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 #
000000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 #
000000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 #
000000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 #
000000F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 #
00000100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 #
00000110 00 00 00 00 18 5D AF 90 #


Note: right now, only one packet can be processed.

In our case (BGP data only) we need to add an L2 header, an IP header and a TCP header with values that resemble BGP, so we pass the required options to text2pcap (-T 179,1025 generates dummy Ethernet, IP and TCP headers with source port 179 and destination port 1025):


C:\Program Files\Wireshark>text2pcap.exe -T 179,1025 -d test-bgp.txt test-bgp.pcap
Input from: test-bgp.txt
Output to: test-bgp.pcap
Generate dummy Ethernet header: Protocol: 0x800
Generate dummy IP header: Protocol: 6
Generate dummy TCP header: Source port: 179. Dest port: 1025
Start new packet
Wrote packet of 280 bytes at 0

-------------------------
Read 1 potential packet, wrote 1 packet


Now, if you load this pcap file into Wireshark, you'll get the following output:



As you can clearly see, there was an unknown attribute (with Type Code 99) inserted into the UPDATE message with the following characteristics:

Flags: Optional, Transitive, Partial, Extended Length
Type code : 99
Length : 3000 bytes
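
These values can be reproduced from the four octets F0 63 0B B8 in the dump. The Python sketch below decodes an attribute header using the flag bit positions from RFC 4271; decode_attribute_header is an illustrative name of my own, not an existing API.

```python
# High-order bits of the attribute flags octet, as defined in RFC 4271.
FLAG_NAMES = [
    (0x80, "Optional"),
    (0x40, "Transitive"),
    (0x20, "Partial"),
    (0x10, "Extended Length"),
]


def decode_attribute_header(data):
    """Decode flags, type code and length of one path attribute (sketch)."""
    flags, type_code = data[0], data[1]
    names = [name for bit, name in FLAG_NAMES if flags & bit]
    if flags & 0x10:
        # Extended Length set: the length field is two octets.
        length = int.from_bytes(data[2:4], "big")
        header_size = 4
    else:
        # Otherwise the length field is a single octet.
        length = data[2]
        header_size = 3
    return names, type_code, length, header_size


names, type_code, length, header_size = decode_attribute_header(
    bytes.fromhex("F0630BB8")
)
print(names, type_code, length, header_size)
```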

The length of the BGP UPDATE message is given as 280 in the BGP header, with a total path attribute length of 253. An attribute of 3000 bytes cannot fit in either, so something clearly went wrong.
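
To double-check these numbers without Wireshark, the packet can be walked by hand. The following Python sketch rebuilds the 280 bytes from the dump above (the variable names and the way the hex string is split are mine) and steps through the path attributes until one of them declares more data than is left.

```python
import struct

# The packet from the error message: headers and attributes as dumped,
# 184 zero bytes, then the 4-byte NLRI at the end.
PACKET = bytes.fromhex(
    "FF" * 16                      # marker
    + "0118" "02" "0000" "00FD"    # length, type, withdrawn len, attr len
    + "40010100"                   # ORIGIN
    + "400206" "02020D1C316E"      # AS_PATH
    + "400304" "C310A161"          # NEXT_HOP
    + "800404" "00000000"          # MULTI_EXIT_DISC
    + "400504" "0000003C"          # LOCAL_PREF
    + "C0081C" "0D1C00020D1C00160D1C00560D1C01F70D1C029A0D1C0813FDE8FDDE"
    + "F0630BB8"                   # unknown attribute 99, declared length 3000
    + "00" * 184
    + "185DAF90"                   # NLRI
)

msg_len, msg_type = struct.unpack_from("!HB", PACKET, 16)
withdrawn_len = struct.unpack_from("!H", PACKET, 19)[0]
attr_len_offset = 21 + withdrawn_len
total_attr_len = struct.unpack_from("!H", PACKET, attr_len_offset)[0]
attrs_start = attr_len_offset + 2
attrs_end = attrs_start + total_attr_len

pos = attrs_start
attrs = []  # (type code, declared length, bytes actually available)
while pos < attrs_end:
    flags, type_code = PACKET[pos], PACKET[pos + 1]
    if flags & 0x10:  # Extended Length: two-octet length field
        length = struct.unpack_from("!H", PACKET, pos + 2)[0]
        header = 4
    else:
        length = PACKET[pos + 2]
        header = 3
    available = attrs_end - (pos + header)
    attrs.append((type_code, length, available))
    if length > available:
        break  # the declared length overruns the attribute section
    pos += header + length

print(msg_len, total_attr_len)
print("well-formed attribute bytes:", pos - attrs_start)
print("last attribute (type, declared, available):", attrs[-1])
```

The walk stops at attribute 99, which declares 3000 bytes while only 184 remain in the attribute section.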

This unknown attribute should carry 3000 bytes of data according to its length field, but only 184 bytes are actually present if you count the octets from where the zeros start (after 0x0BB8) up to the NLRI at the end of the packet. This gives 188 (184 + 4 for Flags/Type Code/Length), which is the number included in the initial error message.

So, the length of the unknown attribute is given as 3000 in the packet, which is 0x0BB8 in hex. If you somehow drop the high-order octet, it becomes 0xB8, which is 184 in decimal. Add the 4 extra bytes (Flags, Type Code, Length) and it becomes 188, and the sum of all attributes becomes 253, which is the value shown in the packet too.

Conversely, if you calculate what the total path attribute length would have been had all attributes been correct, you get 3069, which is 0x0BFD in hex. If you again drop the high-order octet, it becomes 0xFD (253).
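
The arithmetic can be checked in a few lines of Python. This only illustrates the hypothesis that the high-order octet of each length got lost somewhere. The figure of 65 bytes for the remaining attributes is derived from the numbers above (253 - 188), and the variable names are mine.

```python
KNOWN_ATTRS = 253 - 188   # bytes used by the other attributes: 65
HEADER = 4                # Flags + Type Code + 2-octet Length
DECLARED = 0x0BB8         # length declared by the unknown attribute: 3000

# Total path attribute length if the attribute really carried 3000 bytes.
expected_total = KNOWN_ATTRS + HEADER + DECLARED

# Keep only the low-order octet of each value.
truncated_attr = DECLARED & 0xFF
truncated_total = expected_total & 0xFF

print(expected_total, hex(expected_total))
print(truncated_attr, truncated_total)
print(KNOWN_ATTRS + HEADER + truncated_attr == truncated_total)
```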

I guess that, in general, any attribute length larger than 0xFF (255) would have caused the same issue.

Cisco issued an advisory a few hours later, providing fixes for its IOS-XR software.


Regarding the behavior of BGP, the relevant RFC (4271) says the following, so the session resets were to be expected:

NOTIFICATION messages are sent in response to errors or special conditions. If a connection encounters an error condition, a NOTIFICATION message is sent and the connection is closed.
...
A NOTIFICATION message is sent when an error condition is detected. The BGP connection is closed immediately after it is sent.
...
Error checking of an UPDATE message begins by examining the path attributes. If the Withdrawn Routes Length or Total Attribute Length is too large (i.e., if Withdrawn Routes Length + Total Attribute Length + 23 exceeds the message Length), then the Error Subcode MUST be set to Malformed Attribute List.
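
The length check quoted last can be written down directly. The sketch below is only the inequality from the RFC text, not a full UPDATE validator, and exceeds_message_length is a name I made up. It is evaluated with the values from the packet above (Withdrawn Routes Length 0, message Length 280), once with the total attribute length found in the packet (253) and once with the 3069 it would have been had it not been truncated.

```python
def exceeds_message_length(withdrawn_len, total_attr_len, message_len):
    """True if the two length fields cannot fit in the message.

    The constant 23 is the 19-byte BGP header plus the two 2-byte
    length fields (Withdrawn Routes Length and Total Attribute Length).
    """
    return withdrawn_len + total_attr_len + 23 > message_len


# Values as found in the packet: 0 + 253 + 23 = 276, which leaves
# 4 bytes for the NLRI in a 280-byte message.
print(exceeds_message_length(0, 253, 280))

# With the untruncated total attribute length the same message is too short.
print(exceeds_message_length(0, 3069, 280))
```

With the values found in the packet, this particular inequality is satisfied. The inconsistency only shows up once the individual attributes are walked and the unknown attribute claims 3000 bytes.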


There has also been a lot of discussion about the NOTIFICATION-and-reset behavior since this event, and draft-ietf-idr-optional-transitive seems quite interesting.

Links
http://mailman.nanog.org/pipermail/nanog/2010-August/024837.html
http://www.networkworld.com/news/2010/082710-research-experiment-disrupts-internet-for.html
http://www.renesys.com/blog/2010/08/house-of-cards.shtml
https://labs.ripe.net/Members/erik/ripe-ncc-and-duke-university-bgp-experiment/

Question
Is there a chance that creating large "dummy" attributes could cause memory issues on BGP routers?

 
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Greece License.