# Network Traffic Analysis Using tcpdump: Introduction to tcpdump

This course introduces the fundamentals and benefits of using tcpdump as a tool to analyze your network traffic. We'll start with the concepts and output of tcpdump. One of the most important aspects of using tcpdump is being able to write filters that look for specific traffic. Filter writing is fairly basic unless you want to examine fields in an IP datagram that don't fall on byte boundaries, which is why an entire section is devoted to the art of writing filters.

1. Network Traffic Analysis Using tcpdump: Introduction to tcpdump

Judy Novak, Johns Hopkins University Applied Physics Laboratory, jhnovak@ix.netcom.com

All material Copyright © Novak, 2000, 2001. All rights reserved.

3. Course Objectives

• Introduce the fundamentals of tcpdump
• Explain how to write tcpdump filters
• Examine fields in datagram for uses/misuses
• Analyze traffic by placing it in categories
• Demonstrate “real-world” analysis using tcpdump
• Let you participate in the analysis process

Before we start to use tcpdump to analyze traffic, we’ll examine many of the fields found in the IP datagram. This familiarizes you with those fields in theory and with how they might be used in practice. We’ll study how and why fields might be changed and for what purpose.

Next, we’ll start the basic analysis process by looking at tcpdump output and categorizing the kinds of traffic that you can see. Then we’ll look at some real-world examples of how tcpdump was used on monitored networks to discover what was happening. After that, the analysis process will be inspected step by step, often with missteps, to get you comfortable with it.

As a note, all tcpdump output shown in this course is activity that actually occurred. Source and destination hosts/IPs have been altered to obfuscate the true identities.

6. Objectives

• Examine the strengths/weaknesses of tcpdump
• Organize collection/analysis process of tcpdump data via Shadow
• Examine tcpdump output
  • Standard
  • Hexadecimal
  • Length fields and how to convert them to bytes
  • Application layer
  • Interpretation of payload/hex output

8. Strengths

• Provides audit trail/historical record of network activity
• Provides absolute fidelity
• Universally available and used

One of the most important parts of your security infrastructure is at least one tool or software package that captures an audit trail or historical record of the traffic that enters or leaves your network. There will be times when you are required to examine activity or connections that occurred in your network, not just traffic that caused an alarm to sound. For instance, what if you suspect that the packet filtering router that acts as your perimeter defense began acting strangely after some major network changes were made? You would have to examine the traffic that was allowed into your network to help determine the problem. That is where tcpdump is invaluable.

Also, many tools, even firewall logs, will display suspicious traffic yet show only partial data. What if you get a log of rejected traffic, but it doesn’t display or keep TCP flags? You’ll never know what kind of connection was attempted. tcpdump allows the analyst to examine all the bits and fields that are collected. If nothing is “wrong” with the connection, examination at the bit level is unnecessary. Yet, if you suspect something “foul” with the traffic, you really need access to all the data down to the bit level.

tcpdump is also universally used and very portable. If you become familiar with this software or its Windows counterpart, windump, you can use it on just about any platform to assist in analyzing traffic.

9. Weaknesses

• By default, doesn’t collect all the payload
• Does not scale well on large networks
• No idea of state
• Limited operations
• Do-it-yourself interpretations

By default, tcpdump captures only the first 68 bytes of each packet from the network interface. Some of this data is used for the link layer frame header. For Ethernet, 14 bytes hold fields such as the source and destination MAC addresses and the type of embedded data. That leaves only 54 bytes to capture the IP header and embedded protocol header as well as any data. Most of the time this size will allow you to capture the IP header and embedded protocol header. But sometimes protocol headers or data will be truncated. And if you are interested in the data payload, tcpdump is really not the tool to use for this.

tcpdump can collect a large volume of data on larger networks. This can be alleviated by not collecting all the data on the network, for example by omitting web traffic (port 80). Another way to deal with this is more disk space and faster processors to analyze all the collected data. But at some point the volume gets unwieldy.

tcpdump blindly collects packet after packet. It has no idea of state, so it cannot know that a given packet is anomalous because it does not follow the flow of a normal connection. And while tcpdump has some primitive arithmetic operations and ways to manipulate bits, it cannot do complex operations for analyzing data.

Finally, while it is an excellent way to collect data, tcpdump does not attempt to interpret what it sees. It does have some integrity checking operations for certain data to make sure that the data is not irregular, but the analyst has to have the training and savvy to interpret the data. For the sophisticated analyst, this is a bonus because she or he can make the correct call. Compare this with a tool that is prone to false positives and gives no way of verifying the alarmed event. But for an analyst who has little training, tcpdump can be daunting since it does not interpret events.

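
The byte budget described in the notes above can be worked through as a short Python sketch. The 68-byte capture length and the 14-byte Ethernet header come from the notes; the IP, TCP, and UDP header sizes are standard values assumed here for illustration, and `bytes_left` is a hypothetical helper name.

```python
# Byte budget for one captured packet on Ethernet at the default capture length.
SNAPLEN = 68          # default bytes captured per packet, per the notes above
ETHERNET_HEADER = 14  # destination MAC (6) + source MAC (6) + type (2)

# Assumed header sizes (not stated in the slides):
IP_HEADER_MIN = 20    # IPv4 header with no options
TCP_HEADER_MIN = 20   # TCP header with no options
TCP_HEADER_MAX = 60   # TCP header with the maximum 40 bytes of options
UDP_HEADER = 8        # UDP header is a fixed size


def bytes_left(snaplen, *header_sizes):
    """Return how many captured bytes remain after the given headers.

    A negative result means the headers themselves were truncated.
    """
    return snaplen - sum(header_sizes)


after_link_layer = bytes_left(SNAPLEN, ETHERNET_HEADER)
tcp_payload = bytes_left(SNAPLEN, ETHERNET_HEADER, IP_HEADER_MIN, TCP_HEADER_MIN)
udp_payload = bytes_left(SNAPLEN, ETHERNET_HEADER, IP_HEADER_MIN, UDP_HEADER)
tcp_with_options = bytes_left(SNAPLEN, ETHERNET_HEADER, IP_HEADER_MIN, TCP_HEADER_MAX)

print(after_link_layer)   # 54 bytes for IP header, protocol header, and data
print(tcp_payload)        # 14 bytes of TCP payload at best
print(udp_payload)        # 26 bytes of UDP payload at best
print(tcp_with_options)   # -26: a full-size TCP header would be cut off
```

Under these assumptions, a plain TCP packet leaves only 14 bytes of payload in the capture, and a TCP header carrying many options is itself truncated, which matches the caveat that headers or data are sometimes cut off.
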
10. tcpdump Versions

• tcpdump: Unix version; official current version 3.4
  • ftp://ftp.ee.lbl.gov/tcpdump.tar.Z
  • ftp://ftp.ee.lbl.gov/libpcap.tar.Z
• windump: Windows version
  • http://netgroup-serv.polito.it/windump
  • http://netgroup-serv.polito.it/winpcap
• Collective effort; current version 3.5: www.tcpdump.org
  • tcpdump-3.5.tar.gz
  • libpcap-0.5.tar.gz

tcpdump is officially supported by the Lawrence Berkeley Labs. The current version is 3.4. There is also a collective effort, open to anyone interested, to improve tcpdump and patch known problems with tcpdump and libpcap. The software for this effort can be found at www.tcpdump.org. Their current version is 3.5.

For the Unix versions of tcpdump, you need to download software known as libpcap, which implements a portable framework for capturing low-level network traffic. windump is a Windows variant of tcpdump. It also requires an application program interface to collect the traffic, known as winpcap.

The unofficial version of tcpdump has some nice enhancements. It decodes more protocols at the application layer and has a very nice capability of converting hexadecimal payload to character output.

11. tcpdump in Action

[Figure: packets on the network are “sniffed” by tcpdump running on a host, which produces the tcpdump output below.]

tcpdump output:

    07:00:48.036746 ping.net > myhost.com: icmp: echo request (DF)
    07:00:48.036776 myhost.com > ping.net: icmp: echo reply (DF)
    07:02:12.622460 log.net.3155 > syslog.com.514: udp 101
    07:03:01.132414 send.net.32938 > mail.com.25: S 248631:248631(0) win 8760

This slide shows a host running tcpdump and gathering records from the network interface, with the records that tcpdump has collected shown below it. tcpdump has a default standard output based on the protocol (TCP, UDP, ICMP) of the record that is displayed. While the formats for the various protocols are similar to one another, each is distinct in what it displays.

By default, tcpdump will collect and print, in a standard format, all the traffic passing on the network. Command line options alter the default behavior, either by collecting specified records, printing in a more verbose mode, printing in hexadecimal, or writing records as “raw packets” to a file instead of printing to standard output.

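
As a rough illustration of how the protocol-specific formats differ, the following Python sketch labels each of the four records from this slide. The `classify` function and its patterns are hypothetical heuristics written for this output style only; they are not part of tcpdump and will not cover every record tcpdump can print.

```python
import re

# The four records shown on the slide above.
SAMPLE = [
    "07:00:48.036746 ping.net > myhost.com: icmp: echo request (DF)",
    "07:00:48.036776 myhost.com > ping.net: icmp: echo reply (DF)",
    "07:02:12.622460 log.net.3155 > syslog.com.514: udp 101",
    "07:03:01.132414 send.net.32938 > mail.com.25: S 248631:248631(0) win 8760",
]

# Heuristic: TCP is not labeled, but a flags token (S, F, R, P, or "." for
# none) follows the destination, then a sequence range or an "ack" number.
TCP_PATTERN = re.compile(r": [SFRP.]+ (?:\d+:\d+\(\d+\)|ack \d+)")
UDP_PATTERN = re.compile(r": udp \d+")


def classify(line):
    """Guess the protocol of one tcpdump output line (illustrative only)."""
    if " icmp: " in line:
        return "ICMP"
    if UDP_PATTERN.search(line):
        return "UDP"
    if TCP_PATTERN.search(line):
        return "TCP"
    return "unknown"


for record in SAMPLE:
    print(classify(record), "|", record)
```

ICMP and UDP records carry an explicit label, while the TCP record is recognized only by its flags and sequence numbers, so the sketch checks for the labels first and falls back to the TCP pattern.
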
12. Sample tcpdump Output

Sample UDP Record

    09:39:19.470000 nmap.edu.728 > dns.net.111: udp 56

Fields, left to right: timestamp, source.port > dest.port:, protocol, bytes

Sample TCP Record

    09:35:53.660000 nmap.edu.4 > dns.net.111: SF 136747297:136747297(0) win 1028

Fields after the destination: flags (SF), beginning seq # (136747297), ending seq # (136747297), data bytes (0)

    09:32:43.910000 nmap.edu.1171 > dns.net.139: S 2490962508:2490962508(0) win 512
    09:32:43.910000 nmap.edu.1173 > dns.net.21: S 62697789:62697789(0) win 512
    09:32:43.910000 nmap.edu.1193 > dns.net.22: S 1360146849:1360146849(0) win 512
    09:32:43.920000 nmap.edu.1194 > dns.net.1114: S 372884098:372884098(0) win 512

Since we’ll review a lot of tcpdump output in this course, here’s a chance to get more comfortable with it. This is sample output from what appears to be an nmap scan, a popular and informative scan. All records have a timestamp. The sensor host (Redhat Linux 5.2) that captured these records has the precision to capture hundredths of seconds, although tcpdump allows places for up to millionths.

Different protocols have different representations in tcpdump output. One of the first challenges is to identify the protocol (TCP, UDP, ICMP). Most are labeled. TCP isn’t explicitly labeled, but it is the only one with flag bits and sequence and acknowledgment numbers, to name a few clues. Some protocols, like DNS, will be interpreted at the application layer. Because of this, you may not see the normal clues that you are used to, and it may not be obvious whether the record is UDP or TCP, so it is important to look for clues as to which it is.

In general, tcpdump gives details in the order source host > destination host. Note that the number of bytes transferred on SYN packets, shown as (0), is normally 0: they do not carry a payload because they are just part of establishing the three-way handshake.

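
The field layout of the sample TCP record can be made concrete with a small Python parser. This is a minimal sketch that assumes records look exactly like the samples on this slide; `TCP_RECORD` and `parse_tcp_record` are hypothetical names, and other tcpdump output styles would need a different pattern.

```python
import re

# Pattern for TCP records in the style shown on the slide:
# time src.port > dst.port: flags begin:end(nbytes) win size
TCP_RECORD = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<src>\S+)\.(?P<sport>\d+) > "
    r"(?P<dst>\S+)\.(?P<dport>\d+): "
    r"(?P<flags>[SFRP.]+) "
    r"(?P<seq_begin>\d+):(?P<seq_end>\d+)\((?P<nbytes>\d+)\) "
    r"win (?P<win>\d+)"
)

NUMERIC_FIELDS = ("sport", "dport", "seq_begin", "seq_end", "nbytes", "win")


def parse_tcp_record(line):
    """Split one TCP record into named fields, or return None if no match."""
    match = TCP_RECORD.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    for name in NUMERIC_FIELDS:
        fields[name] = int(fields[name])
    return fields


record = parse_tcp_record(
    "09:35:53.660000 nmap.edu.4 > dns.net.111: SF 136747297:136747297(0) win 1028"
)
print(record["src"], record["sport"], "->", record["dst"], record["dport"])
print("flags:", record["flags"])
# The ending sequence number minus the beginning one equals the data bytes.
print(record["seq_end"] - record["seq_begin"] == record["nbytes"])
```

In the sample record the beginning and ending sequence numbers are identical, which agrees with the (0) data bytes shown in parentheses. The SYN and FIN flags set together (SF) do not occur in a normal connection, which is one reason a record like this stands out during analysis.
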
16. Why Shadow?

• No cost (free for all)
• Tunable
  • Customize your own signatures
  • Change at will
• Provides an audit trail of activity to/from network
• Provides an intimate view of activity

While not the only reason to install and use Shadow, a very compelling reason is the price tag. In many cases, but not this one, you get what you pay for. Shadow is an excellent no-cost traffic analysis tool.

Another benefit is that once you master Shadow, you can change it liberally at any time. For instance, if you hear of a new exploit and can fashion a signature with a tcpdump filter, you can modify Shadow instantaneously. Compare this with some intrusion detection systems that do not offer the capability to change filters or signatures. You have to wait for the software company to update the filters when they get around to it, and the updates may not include signatures that you would like to see. Also, since you get all the source code with Shadow, you can customize it for your whims and needs. This is highly unusual and allows you to make changes based on your proficiency with the software.

Shadow uses tcpdump as its collection software. By default, you will collect most activity going into and out of your network. This can be very beneficial in providing an audit trail of activity in the network. If you ever find yourself in the midst of some kind of incident, this may be a very valuable attribute for an intrusion detection system to have.

Finally, some of the more GUI-oriented intrusion detection systems do not allow the user to examine the actual traffic at the IP datagram level. Shadow, by virtue of tcpdump, gives the user a very intimate view of the data collected. You maintain fidelity of data and can use all fields for interpretation and analysis. If the traffic you are analyzing is corrupted in some way, you want to be able to inspect the entire datagram.