Tuesday, March 24, 2020

Finally! It's easy to capture long term.

 "Long term packet capture" and "ease of use" are concepts that rarely go in the same sentence.


Usually, long term packet capture means one of two things. 1) Buy a glitzy, massively expensive, petabyte capture solution that streams a zillion bits per second to disk. 2) Use Wireshark or tshark to do a rolling capture buffer that captures a bunch of files with a dizzying amount of detail.


And then - there was IOTA.


Plug in, power up, press capture. Done.


With the IOTA, long term capture is finally easy - not just to capture, but to analyze as well. Within a few minutes of capturing packets on my home network, I was able to detect and troubleshoot a DNS problem that my wife had been experiencing with Amazon Prime Video for months! (My Roku was doing a round robin on DNS and using an internal DNS address that it was not advertising.)




Another problem with long term capture is the analysis part.


Oftentimes a long term capture means that we need to comb through a mountain of data, hoping to use the right combo of filters to find the root cause. The analysis part is another area where IOTA helps to make things easy.




The dashboards make packets readable. Protocols, utilization, and conversations are easy to select, sort, filter, and analyze. We can even pull the packets back to Wireshark with a click of a button.


I am really looking forward to using this tool in my analysis work and posting what I learn. For now, the first lesson I would like to share is how simple long-term capture can be. It doesn't have to come from a crazy-expensive platform, nor do we need to use lossy hardware (laptops and SPAN ports) to bring in the packet truth. This method of analysis is plug and play, getting you the detail you need to fix problems quickly.


Want to know more about the IOTA 1G and 10G? See https://www.profitap.com/iota-1g/



Wednesday, March 4, 2020

Wireshark tshark Capture With Examples

Before I get into the tshark command syntax and other details, I want to chat about why you want to use tshark or any command line tool. Simply put, working from the command line allows a tremendous amount of consistency and flexibility.

Consistency

When you try to have someone perform your capture using the Wireshark GUI, there are many opportunities for errors and it's just very time consuming. When you have the command line syntax figured out, you can put it in an email, batch file or document, ensuring the client is doing exactly what you wanted. The added bonus is that working from the command line is usually more responsive than remotely controlling a GUI over possibly slow links.

Flexibility

As I mentioned earlier, using the command line allows you to put the command in a batch file or document. This is incredibly useful if you wanted to schedule a capture, or if you wanted to configure a computer to automatically start capturing when it’s turned on. Other examples would be setting a desktop shortcut for the client to start a capture or kicking off a capture from a monitoring system that allows you to run a batch file when an error occurs.
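
As a rough illustration of the "put it in a batch file" idea, here is a minimal Python sketch that only assembles tshark command lines; it does not run them. The function names, the output file name, and the default sizes are my own placeholders, not anything from the video. The flags themselves are standard tshark options: -D lists interfaces with their index, -i selects an interface, -b filesize:/files: sets up a ring buffer, -w names the output file, and -f applies a capture filter.

```python
import shlex


def list_interfaces_command():
    """Command that prints every capture interface with its index number."""
    return ["tshark", "-D"]


def ring_buffer_command(interface, out_file, file_kb=10240, file_count=10,
                        capture_filter=None):
    """Build a tshark ring-buffer capture command as an argument list.

    interface      -- interface index or name, as reported by 'tshark -D'
    out_file       -- base name for the capture files (placeholder name)
    file_kb        -- switch to the next file after this many kB
    file_count     -- keep at most this many files, overwriting the oldest
    capture_filter -- optional capture filter such as 'port 53'
    """
    cmd = [
        "tshark",
        "-i", str(interface),
        "-b", f"filesize:{file_kb}",
        "-b", f"files:{file_count}",
        "-w", out_file,
    ]
    if capture_filter:
        cmd += ["-f", capture_filter]
    return cmd


if __name__ == "__main__":
    # Print the commands so they can be pasted into an email or script.
    print(shlex.join(list_interfaces_command()))
    print(shlex.join(ring_buffer_command(1, "trace.pcapng",
                                         capture_filter="port 53")))
```

Note that shlex.join quotes arguments for a POSIX shell; if the command is going into a Windows batch file, arguments containing spaces need double quotes instead.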

In this video I will cover some of the common command line capture scenarios, as well as how to determine your capture interface index and how to test your commands.



Don’t assume anything when troubleshooting! (John Modlin)

I was working in a large heterogeneous network environment and started working on a problem of scanners at field offices being unable to transfer documents across the WAN. Working on this problem led me quite far down the rabbit hole, or a black hole, to be more specific.


The field techs had already changed out scanners, but the site continued to be intermittently unable to send documents across the WAN. Some documents transferred, some didn’t. After checking permissions and general settings on the scanner, I started looking at the network path.
I ran a traceroute to the scanner to check all the hops. Things looked OK on the surface, with normal responses coming back from each hop. Taking it a step further, I started checking each hop for its ability to pass fully loaded frames. To do this I used the following: ping x.x.x.x -f -l 1472 (-f sets do not fragment, -l sets the payload size to 1472). Assuming the standard network MTU of 1500 bytes, 1472 is the largest payload that each hop should pass. This is because the 20-byte IP header and the 8-byte ICMP header take up 28 of the 1500 bytes available, leaving 1472 bytes for actual payload.
As I tested each hop, I hit one router between the edge and my location that would not pass 1472 bytes. I reduced the payload to 1460 (ping x.x.x.x -f -l 1460), and it passed. I kept adjusting the size until I found the cutoff: 1468 bytes would pass, but 1469 would not.
Taking a capture of the traffic at the scanner, I could see the scanner was setting the Do Not Fragment bit in the IP header by default. So when the scanner sent fully loaded frames (1469-1472 bytes), the packets were dropped at that hop, effectively black-holing the traffic. If the scanner happened to send frames that were not fully loaded (less than 1469 bytes), the traffic passed through the router and the document was received.
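
The arithmetic and the hunt for the cutoff can be sketched in a few lines of Python. This is only an illustration: the probe below is simulated (it never sends a real ping), the function names are mine, and the 1496-byte path MTU is an assumption worked backwards from the 1468-byte cutoff plus 28 bytes of headers. On a real network the probe would wrap a ping with the do-not-fragment option set.

```python
IP_HEADER_BYTES = 20    # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu):
    """Largest ping payload that fits in one unfragmented packet."""
    return mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES


def largest_passing_payload(probe, low, high):
    """Binary search for the largest size in [low, high] that probe accepts.

    probe(size) must return True when a do-not-fragment ping of that payload
    size gets through, and is assumed to fail for every size above the cutoff.
    Returns None if even the smallest size fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always shrinks
        if probe(mid):
            low = mid                 # mid passes, cutoff is mid or higher
        else:
            high = mid - 1            # mid fails, cutoff is below mid
    return low


def simulated_probe(path_mtu):
    """Stand-in for a real ping: passes only if the payload fits path_mtu."""
    def probe(payload):
        return payload <= max_ping_payload(path_mtu)
    return probe


if __name__ == "__main__":
    print(max_ping_payload(1500))                      # 1472
    probe = simulated_probe(path_mtu=1496)             # assumed narrow hop
    print(largest_passing_payload(probe, 0, max_ping_payload(1500)))  # 1468
```

A binary search finds the cutoff in about a dozen probes instead of stepping through the sizes by hand.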
This revelation changed how I troubleshot other applications that were having intermittent problems at field offices: capture the traffic, check whether the “Do Not Fragment” bit is set in the IP header, and then check the traffic path for black holes. The Do Not Fragment bit is defined by RFC 791. In your capture file in Wireshark, you can also look for ICMP Type 3 Code 4 (“fragmentation needed and DF set”) errors, which point to a hop along the path that cannot forward the packet without fragmenting it. By correlating the issues to a geographical map, I found that the hand-off between two telco companies in a large MPLS network was costing 4 bytes of usable MTU as traffic traversed from one telco to the other. This ended up being an expensive, month-long fix of upgrading telco equipment at many locations, but it resolved countless issues that 40,000+ users had been experiencing intermittently.
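
In Wireshark itself, the display filters ip.flags.df == 1 and icmp.type == 3 && icmp.code == 4 pick out these packets. To show what those filters are actually matching, here is a small Python sketch that reads the same fields out of raw IPv4 bytes. The function names and the sample packet are made up for illustration, and the next-hop MTU field comes from RFC 1191; older routers may leave it at zero.

```python
import struct

DF_MASK = 0x4000        # "Do Not Fragment" bit in the flags/fragment field
ICMP_PROTOCOL = 1


def parse_ipv4(packet):
    """Pull the DF bit, protocol and header length from an IPv4 header."""
    first, _tos, _length, _ident, flags_frag, _ttl, proto = struct.unpack(
        "!BBHHHBB", packet[:10])
    return {
        "df": bool(flags_frag & DF_MASK),
        "protocol": proto,
        "header_len": (first & 0x0F) * 4,   # IHL is counted in 32-bit words
    }


def fragmentation_needed_mtu(packet):
    """Return the next-hop MTU from an ICMP Type 3 Code 4 message, else None."""
    ip = parse_ipv4(packet)
    if ip["protocol"] != ICMP_PROTOCOL:
        return None
    icmp = packet[ip["header_len"]:ip["header_len"] + 8]
    if len(icmp) < 8:
        return None
    icmp_type, code, _checksum, _unused, next_hop_mtu = struct.unpack(
        "!BBHHH", icmp)
    if icmp_type == 3 and code == 4:
        return next_hop_mtu
    return None


def sample_frag_needed_packet(next_hop_mtu):
    """Synthetic packet for the demo; checksums are left at zero."""
    ip_header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 0, DF_MASK, 64,
                            ICMP_PROTOCOL, 0,
                            bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    icmp_header = struct.pack("!BBHHH", 3, 4, 0, 0, next_hop_mtu)
    return ip_header + icmp_header


if __name__ == "__main__":
    packet = sample_frag_needed_packet(1496)
    print(parse_ipv4(packet)["df"])           # True
    print(fragmentation_needed_mtu(packet))   # 1496
```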

The moral of the story?

Telco’s main job is to provide transport.

Don’t assume anything when troubleshooting!
