Network Traffic Analysis Using tcpdump Writing tcpdump Filters

  1. Network Traffic Analysis Using tcpdump: Writing tcpdump Filters
     Judy Novak
     Johns Hopkins University Applied Physics Laboratory
     jhnovak@ix.netcom.com

     All material Copyright © Novak, 2000, 2001. All rights reserved.

  2. Writing tcpdump Filters
     • Introduction to tcpdump
     • Writing tcpdump Filters
     • Examination of Datagram Fields
     • Beginning Analysis
     • Real World Examples
     • Step by Step Analysis

  3. Objectives
     • Review the foundations needed to understand and create tcpdump filters, including:
       • tcpdump filter format
       • Review of bit/byte theory
       • Review of binary/hexadecimal numbering systems
       • Review of bit masking
     • Learning to formulate tcpdump filters
     • Review of tcpdump output

     tcpdump filters are necessary to selectively gather or read records of network traffic. This section may be difficult to follow if you haven’t been exposed to the theory before, but it is more than an academic exercise. To comprehend network traffic at its lowest level, you have to understand tcpdump filters. Familiarity with filters is also necessary if you want to search tcpdump files for some trait. For instance, to identify the beginning of a TCP connection, you would search for traffic in which only the SYN bit is set.

  4. Foundations For Understanding tcpdump Filters
     • Specify item of interest for record selection
       • Any field in the IP datagram
       • Examples: header length or TCP flags
     • Variables for more commonly used fields:
       • Examples: “port” or “host”
     • Less common fields:
       • Identify protocol
       • Identify byte displacement
       • Examples: ip[0], tcp[13]

     A tcpdump filter needs to specify an item of interest: a field in the IP datagram that is used for record selection. Such an item can be part of the IP header (such as the IP header length), the TCP header (such as the TCP flags), the UDP header (such as the destination port), or the ICMP message (such as the message type). tcpdump provides a name for each type of header. As you would expect, ip denotes a field in the IP header or data portion of the IP datagram, tcp a field in the TCP header or segment, udp a field in the UDP header or UDP datagram, and icmp a field in the ICMP message. For instance, ip[0] is the byte at offset 0 of the IP datagram, which is part of the IP header (remember that counting starts at 0). tcp[13] is the byte at offset 13 of the TCP segment, which is also part of the TCP header and holds the TCP flags, and icmp[0] is the byte at offset 0 of the ICMP message, which is the ICMP message type.

     Sample filters and reference material are found in:
     • tcpdump man pages

  5. Specifying Fields
     IP header layout (20 bytes without options), bits 0 through 31 of each row from left to right:

     | 4-bit version | 4-bit IP header length | 8-bit TOS | 16-bit total length (in bytes)         |
     | 16-bit IP identification number                    | 3-bit flags | 13-bit fragment offset   |
     | 8-bit time to live (TTL) | 8-bit protocol          | 16-bit header checksum                 |
     | 32-bit source IP address                                                                    |
     | 32-bit destination IP address                                                               |

     Callouts on the slide: the TOS byte is labeled ip[1] (the protocol[displacement] form) and the source IP address is labeled src host (the macro form).

     Looking at the IP header as an example, we learn two ways to specify fields. The easier way is to use a tcpdump macro, but not all fields have macros. The source IP address can be specified by combining two macros, “src” and “host”. If we want to look at the type of service field, however, we have to identify the protocol in which the field is found (IP, because the field is in the IP header) and the field's displacement in bytes from the start of that protocol (1).

     What are some of the more common macros used in filters?

     host      select the record if either the source or destination host matches this IP
     net       select the record if either the source or destination subnet matches; this is useful if several IPs from the same subnet are of interest to you
     port      select the record if either the source or destination port matches
     src host  select the record if the source host matches
     dst host  select the record if the destination host matches
     src net   select the record if the source subnet matches
     dst net   select the record if the destination subnet matches
     src port  select the record if the source port matches
     dst port  select the record if the destination port matches
     icmp      select the record if the protocol field ip[9] has a value of 1
     tcp       select the record if the protocol field ip[9] has a value of 6
     udp       select the record if the protocol field ip[9] has a value of 17

  6. The tcpdump Filter Format
     • The two different formats for a tcpdump filter are:
       • Byte offset form, [offset:length]:
           ip[9] = 1
           tcp[2:2] < 20
           udp[4:2] != 0
           icmp[0] = 8
       • Variable form:
           port 23
           dst host 1.2.3.4
           src net 0

     The first filter, ip[9] = 1, selects any record with an IP protocol of 1 (ICMP). The second filter, tcp[2:2] < 20, selects any record with a TCP destination port less than 20. The third filter, udp[4:2] != 0, selects any UDP record with a non-zero UDP length. The fourth filter, icmp[0] = 8, selects any record with an ICMP message type of 8, an ICMP echo request.

     The first variable filter, port 23, selects any record with a source or destination port of 23 (telnet). The second, dst host 1.2.3.4, selects any record with destination host 1.2.3.4. The third, src net 0, selects any record with a source subnet of 0.x.x.x.

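To make the byte offset form concrete, the following Python sketch mimics how protocol[offset:length] picks bytes out of a header and reads them as one big-endian number. The helper name `field` and all sample header bytes are assumptions made up for illustration (checksums are not computed); this shows what the notation means, not how tcpdump is implemented.

```python
def field(header: bytes, offset: int, length: int = 1) -> int:
    """Read `length` bytes starting at `offset` as one big-endian integer,
    the way a proto[offset:length] expression names a field."""
    chunk = header[offset:offset + length]
    if len(chunk) != length:
        raise ValueError("field extends past the end of the header")
    return int.from_bytes(chunk, "big")


# Hypothetical sample headers (illustrative values, checksums not computed).
ip_header = bytes.fromhex("4500003000004000400100000a0000010a000002")
tcp_header = bytes.fromhex("04010013") + bytes(16)   # src port 1025, dst port 19
udp_header = bytes.fromhex("00890089004c1fd7")       # length field is 0x004c
icmp_message = bytes.fromhex("0800000000010001")     # type 8 is an echo request

print(field(ip_header, 9) == 1)        # ip[9] = 1      (protocol is ICMP)
print(field(tcp_header, 2, 2) < 20)    # tcp[2:2] < 20  (destination port)
print(field(udp_header, 4, 2) != 0)    # udp[4:2] != 0  (UDP length)
print(field(icmp_message, 0) == 8)     # icmp[0] = 8    (echo request)
```

Each print line corresponds to one of the four byte offset filters on the slide, evaluated against a sample header that the filter would select.
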
  7. Bit/Byte Fundamentals
     • A byte is an 8 bit field
     • It is possible to denote a span of bytes, e.g. udp[0:2]
     • Smallest precision that the tcpdump “language” offers is a byte
     • How do you reference bits within a byte?
       • Bit masking

     First 4 bytes (bytes 0 - 3) of the IP header:

     BYTE 0                           BYTE 1       BYTES 2 - 3
     4 bit version | 4 bit length     8 bit TOS    16 bit IP total length

     The bit is the smallest unit that can be represented by a computer; it can have a value of either 0 or 1. A byte is composed of 8 bits. Byte counting begins at byte 0, and all successive bytes fall on these 8 bit boundaries. udp[0:2] specifies the two bytes of the UDP datagram that begin at byte 0, which hold the source port. Bit masking, a combination of boolean arithmetic and binary/hexadecimal values, will help “isolate” bits.

  8. Decimal/Binary Representations
     Base 10 Arithmetic - Decimal

       10^2  10^1  10^0
         2     6     5      = 2x100 + 6x10 + 5x1 = 265

     Base 2 Arithmetic - Binary

       2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
       128   64   32   16    8    4    2    1
         1    0    0    0    0    0    0    1      = 1x128 + 1x1 = 129

     Because decimal is our native number system, we don’t have to do any conversions to understand the value of a number. But if you examine a number, you realize that each digit has a value based on its position. The least significant digits (to the right) carry the least value and the most significant digits (to the left) carry the most. Each position is weighted by an increasing power of the base, here base 10.

     The same theory applies to binary, or base 2. Instead of powers of 10, we use powers of 2 to work out the decimal value of the number. Because we are talking in terms of a byte, we use 8 bits (binary digits) to represent it. Above, we see how the binary number 10000001 converts to decimal 129.

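The positional rule above can be sketched in a few lines of Python. The function name `positional_value` is an assumption chosen for illustration; it spells out the digit-times-power sum by hand rather than relying on Python's built-in conversion.

```python
def positional_value(digits: str, base: int) -> int:
    """Sum each digit multiplied by the base raised to the digit's position,
    counting positions from the rightmost digit (position 0)."""
    total = 0
    for power, ch in enumerate(reversed(digits)):
        total += int(ch, base) * base ** power
    return total


print(positional_value("265", 10))       # 2x100 + 6x10 + 5x1
print(positional_value("10000001", 2))   # 1x128 + 1x1
```

The same function works for any base from 2 to 36, which is why it can also be reused for the hexadecimal values discussed in this section.
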
  9. Binary/Hex Conversion
     Base 2 Arithmetic - Binary

       2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
       128   64   32   16    8    4    2    1
         1    0    0    0    0    0    0    1      = 1x128 + 1x1 = 129

     Base 16 Arithmetic - Hexadecimal

       2^3  2^2  2^1  2^0     2^3  2^2  2^1  2^0
         1    0    0    0       0    0    0    1
               8                      1            81 hex = 8x16^1 + 1x16^0 = 129

     4 binary bits represent one hex character. 1000 0001 binary is 81 hex. To denote hex we use the 0x prefix: 0x81.

     If you consider a byte as two hexadecimal characters, each character is 4 bits long. A 4-bit chunk (nibble) can represent 16 different values, 0 through 15; if all of its bits are turned on (set to 1), it holds the maximum value of 15 (8 + 4 + 2 + 1). Counting in hex goes from 0 to 9, then 10 = a, 11 = b, 12 = c, 13 = d, 14 = e, 15 = f.

     The leftmost bits are called the high-order bits because they have the most value, whereas the rightmost bits are referred to as the low-order bits. The same holds true for bytes: the leftmost are known as high-order bytes and the rightmost as low-order bytes. Remember from arithmetic that any non-zero number raised to the power 0 is 1.

     Terminology:
     Byte = 8 bits
     Nibble = 4 bits
     Hex char = 4 bits
     Word = 32 bits

  10. Hexadecimal Representation

     2^3 2^2 2^1 2^0   (Hex)        2^3 2^2 2^1 2^0   (Hex)
      0   0   0   0  =  0            1   0   0   0  =  8
      0   0   0   1  =  1            1   0   0   1  =  9
      0   0   1   0  =  2            1   0   1   0  = 10 (a)
      0   0   1   1  =  3            1   0   1   1  = 11 (b)
      0   1   0   0  =  4            1   1   0   0  = 12 (c)
      0   1   0   1  =  5            1   1   0   1  = 13 (d)
      0   1   1   0  =  6            1   1   1   0  = 14 (e)
      0   1   1   1  =  7            1   1   1   1  = 15 (f)

     A single hexadecimal character covers the values 0 to 15. The problem is how to write the values above 9 so that we can tell decimal and hexadecimal apart. A value of 10 decimal is not the same as 10 hexadecimal, which has a value of 16 in decimal. So for values above 9 we use letters to represent 10 through 15, as you can see in the second column above. The letters in parentheses are the hexadecimal representations of the decimal numbers beside them.

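As a small illustration of the table above, this Python sketch splits a byte into its two nibbles and looks each one up as a hex character. The names `HEX_DIGITS` and `byte_to_hex` are assumptions for this example only.

```python
HEX_DIGITS = "0123456789abcdef"   # index 10 is 'a', index 15 is 'f'


def byte_to_hex(value: int) -> str:
    """Write one byte as two hex characters with the 0x prefix."""
    if not 0 <= value <= 255:
        raise ValueError("a byte holds values 0 through 255")
    high, low = divmod(value, 16)   # high-order nibble, low-order nibble
    return "0x" + HEX_DIGITS[high] + HEX_DIGITS[low]


print(byte_to_hex(0b10000001))   # the byte 1000 0001 from the earlier slide
print(byte_to_hex(10))           # decimal 10
print(byte_to_hex(16))           # decimal 16, written "10" in hex
```

The last two lines show why the letters are needed: decimal 10 and hexadecimal 10 are different values and must be written differently.
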
  11. Figuring Out Decimal Values for Hex Output
     1 Use reference to discover where fields start and end
     2 Each character in the hex output is a power of 16
     3 Start at the rightmost character and increase power of 16
     4 Multiply each character by its power of 16, add all values

     First 8 bytes of hexadecimal output of a UDP header:

     Step 1:      Source Port            Dest Port              Length                 Checksum
                  0089                   0089                   004c                   1fd7
     Steps 2, 3:  16^3 16^2 16^1 16^0    16^3 16^2 16^1 16^0    16^3 16^2 16^1 16^0    16^3 16^2 16^1 16^0
                   0    0    8    9       0    0    8    9       0    0    4    c       1    f    d    7
     Step 4:      8*16^1 + 9*16^0 = 128 + 9 = 137

     When you see hexadecimal output and need to translate it into something coherent, how do you start? Let’s assume that we are looking at a field or fields that have numeric values; in other words, we are not looking at a string payload. We will use 8 bytes of hexadecimal output from a UDP header to describe the process of figuring out the decimal values of all the fields.

     The first thing you need to do is identify what you are looking at. Most of the time, hex output will show the entire datagram. Here, for demonstration purposes, we take an excerpt: the first 8 bytes of the UDP header. You will need a reference, such as TCP/IP Illustrated, Volume 1 by W. Richard Stevens or the references at the back of the course, to identify the fields in the UDP header. Remember that each character in the output is one hex character (4 bits), so there are 2 hex characters in a byte. The UDP header contains a 16-bit source port, a 16-bit destination port, a 16-bit UDP length, and a 16-bit checksum. These happen to be all 2-byte fields, or 4 hex characters each, and we divide up the hex output accordingly.

     Next, start with the rightmost hex character of a field and label it with 16^0. For each remaining hex character in that field, move left and increase the power of 16 until you reach the leftmost character in the field. Then multiply each character by the power of 16 above it and add all the values.

     Using the source port 0089 as an example, we start with the rightmost character, 9, and label it 16^0. Only one more character is non-zero, 8, and we label it 16^1. Now we multiply the rightmost character 9 by 16^0 (any non-zero number to the 0 power is 1) and get 9. Then we multiply the next character 8 by 16^1 (or 16) and get 128. Adding 128 and 9, we arrive at 137, the source port typically associated with NetBIOS name service queries.

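The four steps can be sketched in Python for all four fields of the UDP header excerpt. The function and variable names are assumptions for illustration; the final check against `struct.unpack` is only there to confirm that the hand method agrees with a standard library decoder.

```python
import struct


def hex_field_to_decimal(hex_chars: str) -> int:
    """Steps 2-4: weight each hex character by a power of 16, starting at
    16^0 for the rightmost character, and add the products."""
    total = 0
    for power, ch in enumerate(reversed(hex_chars)):
        total += int(ch, 16) * 16 ** power
    return total


# Step 1: split the output into the four 16-bit UDP header fields.
udp_hex = "0089 0089 004c 1fd7"
field_names = ["source_port", "dest_port", "length", "checksum"]
udp_fields = {
    name: hex_field_to_decimal(chars)
    for name, chars in zip(field_names, udp_hex.split())
}
print(udp_fields)

# Cross-check: decode the same 8 bytes as four big-endian 16-bit integers.
raw = bytes.fromhex(udp_hex.replace(" ", ""))
assert struct.unpack("!4H", raw) == tuple(udp_fields.values())
```

Working the remaining fields by hand gives the same results: 004c is 4*16 + 12 = 76, and 1fd7 is 1*4096 + 15*256 + 13*16 + 7 = 8151.
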
  12. Your Turn
     These are the first four bytes of the IP header:

       4500 0030

     Use the reference pages at the end of the course to figure out what the 16-bit total length is in decimal.

     Figure out the decimal value of the 16-bit total length. Use the reference materials at the end of this course to find a layout of the IP header and where the 16-bit total length falls in it. Once you’ve located that field, use the methods discussed to figure out the decimal equivalent of the hex value.

  13. Answer

     IP version   IP header length   TOS    16-bit total length
                                            16^3 16^2 16^1 16^0
         4               5           0 0     0    0    3    0

     3*16^1 = 48
     Answer: 48 bytes in the IP datagram

     The first thing we do is look at the layout of the IP header. The 16-bit total length field is found at byte offsets 2 and 3 of the IP header (counting starts at 0). We find a value of 0030 in these 2 bytes. We methodically label all the hex digits in this field with powers of 16, starting at the rightmost digit, 0. Because there is only one non-zero digit in the IP length field, we only need to figure out its value. The non-zero digit 3 is located in the 16^1 position, so we simply multiply 3*16 and discover that the IP length is 48 bytes.

  14. The Problem: Looking at Fields Less Than a Byte

     Layout of first byte:           4 bit IP version | 4 bit header length
     Current value in IP version:    0 1 0 0
     Desired value in IP version:    0 0 0 0

     We run into a slight problem when we deal with fields in an IP datagram that are less than a byte in length. The first byte of the IP header is actually two different fields: a 4 bit IP version and a 4 bit header length. If we use the protocol[displacement] notation, ip[0] covers both fields. What if we want to look at the 4 bit IP header length only and are not interested in the 4 bit IP version? No simple operation native to the tcpdump “language” allows us to do this. But we can manipulate fields and bits in a way that lets us look at the 4 bit header length only. In essence, if we can zero out all the bits in the IP version field, then looking at ip[0] shows us just the 4 bit header length. How exactly do we zero out this high-order nibble and preserve the low-order nibble that holds the 4 bit header length? This is what we will discuss next.

  15. More Fundamentals
     • Individual bit or a range of bits selected by bit masking
     • Uses the boolean AND operation to keep or discard a bit(s)
     • Two bits are AND’ed; the following values yield the following results

     BIT A   AND   BIT B   =   RESULT
       0             0            0
       1             0            0
       0             1            0
       1             1            1

     We will use the boolean AND operation to help us zero out unwanted bits. Let’s look at the fundamentals of applying this theory. Because we are dealing with computers that talk in binary, we consider every combination of the only two possible bit values, 0 and 1. As you can see from the truth table above, the only time the resulting value is 1 is when both bits that are AND’ed are 1. If you think of “BIT A” as the bit found in the original byte and “BIT B” as a mask value that is AND’ed with “BIT A”, you can determine the appropriate mask value to either discard or preserve an original bit.

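The truth table, and the keep-or-discard reading of it, can be reproduced with Python's bitwise `&` operator. The variable name `truth_table` is an assumption for this sketch.

```python
# Every combination of two bits and the result of AND'ing them.
truth_table = [(a, b, a & b) for a in (0, 1) for b in (0, 1)]

print("BIT A  BIT B  RESULT")
for bit_a, bit_b, result in truth_table:
    print(f"  {bit_a}      {bit_b}      {result}")

# Reading BIT B as the mask: a mask bit of 1 preserves the original bit,
# a mask bit of 0 discards it (the result is always 0).
for original_bit in (0, 1):
    assert original_bit & 1 == original_bit
    assert original_bit & 0 == 0
```

The two assertions at the end are the whole basis of bit masking: choose 1 for every bit position to keep and 0 for every position to clear.
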
  16. Solution: “AND” Unwanted Bits With 0’s

     Current value in IP version:          0 1 0 0
     AND with:                             0 0 0 0
     Resulting high order nibble value:    0 0 0 0

     The solution to dealing with fields that are less than a byte is to zero out all bits in the byte other than those we are interested in. In this instance, we want to “AND” the high-order 4 bits of the first byte of the IP header with zeros. This yields zeros in the positions where there might once have been non-zero values.

  17. Solution: “AND” Wanted Bits With 1’s

     Current value in IP header length:    0 1 0 1
     AND with:                             1 1 1 1
     Resulting low order nibble value:     0 1 0 1

     Because we are dealing with an entire byte, we must also pay attention to the low-order nibble, the IP header length that we want to preserve. We can’t simply ignore this field; we must keep the original value found there. To preserve the current value, we “AND” every bit of the nibble with a value of 1. This does not change the value found in that nibble.

  18. The Mask Byte

     Current value in first byte of IP header:    0 1 0 0   0 1 0 1
     Mask value:                                  0 0 0 0   1 1 1 1     0000 1111   Hex - 0x0f
     Resulting byte value:                        0 0 0 0   0 1 0 1

     Ultimately, what you have to do is create a “mask” byte. This byte is AND’ed with the original value found in the first byte of the IP header to give us the desired result: a byte whose high-order nibble is all zeros and whose low-order nibble is as it was before the AND operation. Our mask byte is therefore 0000 1111, which translates to the two hexadecimal characters 0f.

  19. Putting it all Together

     Current value in first byte of IP header:    0 1 0 0   0 1 0 1
     Mask value:                                  0 0 0 0   1 1 1 1     0000 1111   Hex - 0x0f

     Partial filter = ip[0] & 0x0f
                      (field AND mask)

     We figured out the mask that we want to AND with the first byte of the IP header, but how do we tell tcpdump to do this? First we identify the byte (or bytes) we are dealing with by naming the protocol (IP) and the displacement into the protocol at which the byte is found (0, the first byte). Next, we use the “&” symbol to denote the AND operation, and then we supply the value to AND it with. This is the mask value that we figured out, 0x0f in hexadecimal.

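The partial filter ip[0] & 0x0f can be checked with the same operator in Python. This is a minimal sketch using the example byte from the slide (0x45); the constant and variable names are assumptions for illustration.

```python
MASK_LOW_NIBBLE = 0x0F   # 0000 1111: discard the version, keep the header length

first_byte = 0x45        # 0100 0101: version 4, header length 5
header_length = first_byte & MASK_LOW_NIBBLE

# Show the field, the mask, and the result as 8-bit binary strings.
print(f"{first_byte:08b} & {MASK_LOW_NIBBLE:08b} = {header_length:08b}")
print("header length:", header_length)
```

Printing the three values in binary makes it easy to compare the output bit by bit against the rows on the slide.
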
  20. And Your Point Would Be?

     First byte of IP header:    4 bit version | 4 bit length

     A 1 in a mask bit preserves a corresponding value bit, a 0 in a mask bit discards a corresponding value bit.

     2^3 2^2 2^1 2^0    2^3 2^2 2^1 2^0
      0   1   0   0      0   1   0   1      Current IP byte 0 fields, version = 4, length = 5
      0   0   0   0      1   1   1   1      Mask value
      0   0   0   0      0   1   0   1      Discards first 4 bits, preserves second 4 bits

     The mask would be 0x0f and the partial filter would be ip[0] & 0x0f.

     Once the mask has been computed to determine which bits to discard and which to preserve, it has to be “superimposed” over some byte or span of bytes. In this case we need to superimpose the mask over the entire first byte of the IP header because that is where the fields we are interested in lie. That byte is represented by ip[0], so the partial filter that superimposes the mask over the field of interest becomes ip[0] & 0x0f.

     A way to test whether an IP datagram has options is to test whether the IP header length is greater than 5. The header length is counted in 32 bit “words” of 4 bytes each, so a value of 5 means a 20 byte header with no options. The filter then becomes:

       ip[0] & 0x0f > 5

     If this filter is included in the tcpdump statement with the proper notation (for example, enclosed in single quotes so the shell does not interpret the & and > characters), or placed in a file that is pointed to by the tcpdump option -F, all records read that have an IP header length greater than 5 are selected.

     What would the mask be to preserve the high-order 4 bits (the version number) and discard the low-order 4 bits (the length)?

             0  1  0  0  0  1  0  1
       AND  __ __ __ __ __ __ __ __     MASK?

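The options test ip[0] & 0x0f > 5 can be sketched in Python as follows. The function names and the two sample headers are assumptions made up for illustration (only the first byte is meaningful, the rest are zero padding). As in the filter, Python applies `&` before the comparison, but the parentheses make the order explicit.

```python
def ip_header_length_words(ip_header: bytes) -> int:
    """Low-order nibble of byte 0: header length in 32-bit words."""
    return ip_header[0] & 0x0F


def has_ip_options(ip_header: bytes) -> bool:
    """Mirror of the filter ip[0] & 0x0f > 5."""
    return (ip_header[0] & 0x0F) > 5


# Hypothetical headers: only byte 0 is filled in.
no_options = bytes([0x45]) + bytes(19)     # length 5 words = 20 bytes
with_options = bytes([0x46]) + bytes(23)   # length 6 words = 24 bytes

for header in (no_options, with_options):
    words = ip_header_length_words(header)
    print(words, "words =", words * 4, "bytes, options:", has_ip_options(header))
```

Multiplying the word count by 4 gives the header size in bytes, which is how a length of 5 corresponds to the 20 byte header without options.
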