This is a short description of what the nfdump tools do and how they work.
Nfdump is distributed under the BSD license - see BSD-license.txt

The nfdump tools collect and process netflow data on the command line.
They are part of the NFSEN project, which is described in more detail at
http://www.terena.nl/tech/task-forces/tf-csirt/meeting12/nfsen-Haag.pdf

The Web interface mentioned is not part of nfdump and is available at
http://nfsen.sourceforge.net

*NOTE*
This version no longer builds nfprofile by default!

nfdump tools overview:
----------------------

nfcapd - netflow collector daemon. 
Reads the netflow data from the network and stores the data into files.
Automatically rotates files every n minutes ( typically every 5 min ).
nfcapd reads netflow v5, v7 and v9 flows transparently. 
You need one nfcapd process for each netflow stream.

nfdump - netflow dump.
Reads the netflow data from the files stored by nfcapd. Its syntax is
similar to tcpdump. If you like tcpdump you will like nfdump.
Displays netflow data and creates top N statistics of flows, bytes,
packets and IP addresses.

nfreplay - netflow replay
Reads the netflow data from the files stored by nfcapd and sends it
over the network to another host.

nfexpire - expire old netflow data
Manages data expiration. Sets appropriate limits.

Optional binaries:

nfprofile - netflow profiler. Required by NfSen
Reads the netflow data from the files stored by nfcapd. Filters the 
netflow data according to the specified filter sets ( profiles ) and
stores the filtered data into files for later use. 

ft2nfdump - read flow-tools format - Optional tool
ft2nfdump acts as a pipe converter for flow-tools data. It allows you
to read any flow-tools data and to process and save it in nfdump format.

sfcapd - sflow collector daemon
sfcapd collects sflow data and stores it into nfcapd compatible files.
"sfcapd includes sFlow(TM), freely available from http://www.inmon.com/".

Note for sflow users:
sfcapd and nfcapd can be used concurrently to collect netflow and sflow
data at the same time. Generic command line options apply to both
collectors alike. Due to lack of availability of sflow devices,
I could not test the correct output of IPv6 records. Users are requested 
to send feedback to the list or directly to me. As of this first version,
sfcapd supports the same fields as nfcapd does for netflow v9, which is a 
subset of all available sflow fields in an sflow record. More fields will
be integrated in future versions of sfcapd.

Converting the current nfcapd flat directory layout to any sub hierarchy
layout:
If you switch from the flat directory layout to any sub directory hierarchy,
the helper script CreateSubHierarchy.pl helps you create the desired
sub directory structure and moves already existing nfcapd files into the
new layout. Use the same -S option for CreateSubHierarchy.pl as you will
use with nfcapd.

Compression
-----------
As of nfdump 1.5.6, the binary data files can optionally be compressed
using the fast LZO1X-1 compression. For more details on this algorithm
see http://www.oberhumer.com/opensource/lzo. LZO1X-1 is fast enough
that compression can be used in real time by the collector. LZO1X-1
reduces the file size by around 50%. You can check the compression speed
for your system by running ./nftest <path/to/an/existing/netflow/file>.


Principle of Operation:
-----------------------
The goal of the design is to be able to analyze netflow data from
the past as well as to track interesting traffic patterns
continuously. The amount of time back in the past is limited only
by the disk storage available for all the netflow data. The tools
are optimized for speed to allow efficient filtering. The filter
syntax should look familiar to tcpdump users ( pcap compatible ).

All data is stored to disk before it gets analyzed. This separates
storing the data from analyzing it.

The data is organized in a time based fashion. Every n minutes
- typically 5 min - nfcapd rotates and renames the output file
with the timestamp nfcapd.YYYYMMddhhmm of the interval, e.g.
nfcapd.200407110845 contains data from July 11th 2004 08:45 onward.
Based on a 5 min time interval, this results in 288 files per day.
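
The naming scheme above can be illustrated with a small Python sketch.
It is not part of nfdump. The function names are hypothetical, and it
assumes that rotation intervals are aligned to the start of the day,
which matches the example file name above but is not stated explicitly.

```python
from datetime import datetime

ROTATION_MINUTES = 5  # typical interval from the text; adjust to your setup


def nfcapd_file_name(ts, interval_min=ROTATION_MINUTES):
    """Return the name of the file whose interval contains ts.

    Assumption: intervals are aligned to midnight.
    """
    minutes = ts.hour * 60 + ts.minute
    start = minutes - minutes % interval_min  # round down to interval start
    slot = ts.replace(hour=start // 60, minute=start % 60,
                      second=0, microsecond=0)
    return "nfcapd." + slot.strftime("%Y%m%d%H%M")


def files_per_day(interval_min=ROTATION_MINUTES):
    """Number of files written per day for a given rotation interval."""
    return 24 * 60 // interval_min


print(nfcapd_file_name(datetime(2004, 7, 11, 8, 47, 30)))  # nfcapd.200407110845
print(files_per_day())                                     # 288
```

A flow seen at 08:47:30 therefore ends up in the file stamped 08:45.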

Analyzing the data can be done for a single file, or by concatenating
several files for a single output. The output is either ASCII text
or binary data, when saved into a file, ready to be processed again
with the same tools.

You may have several netflow sources - let's say 'router1' 'router2'
and so on. The data is organized as follows:

/flow_base_dir/router1
/flow_base_dir/router2

which means router1 and router2 are subdirs of the flow_base_dir.
For each of the netflow sources you have to start an nfcapd process:

nfcapd -w -D -l /flow_base_dir/router1 -p 23456
nfcapd -w -D -l /flow_base_dir/router2 -p 23457

Security: none of the tools requires root privileges, unless nfcapd
listens on a port < 1024. However, there is no access control mechanism
in nfcapd. It is assumed that host level security is in place to filter
the proper IP addresses.
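
As a rough sketch of the setup described above, the following Python
snippet builds one nfcapd command line per netflow source and flags
ports below 1024. It only prints the commands; it does not start
anything. The helper name and the source-to-port mapping are
hypothetical, and the options are exactly those of the example above.

```python
import shlex

BASE_DIR = "/flow_base_dir"                     # path used in the text
SOURCES = {"router1": 23456, "router2": 23457}  # source name -> UDP port


def nfcapd_commands(base_dir, sources):
    """Build one nfcapd command line per netflow source."""
    ports = list(sources.values())
    if len(set(ports)) != len(ports):
        raise ValueError("each source needs its own port")
    commands = []
    for name, port in sources.items():
        argv = ["nfcapd", "-w", "-D", "-l", f"{base_dir}/{name}",
                "-p", str(port)]
        needs_root = port < 1024  # privileged port, see the security note
        commands.append((shlex.join(argv), needs_root))
    return commands


for line, needs_root in nfcapd_commands(BASE_DIR, SOURCES):
    print(line, "(needs root)" if needs_root else "")
```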

See the manual pages or use the -h switch for details on using 
each of the programs.  For any questions send email to haag@switch.ch

Configure your router to export netflow. See the relevant documentation
for your model. 

A generic Cisco sample configuration enabling NetFlow on an interface:

	interface fastethernet 0/0
	ip route-cache flow

To tell the router where to send the NetFlow data, enter the following 
global configuration command:

	ip flow-export <ip-address> <udp-port>
	ip flow-export version 5 

	ip flow-cache timeout active 5

This breaks up long-lived flows into 5-minute segments. You can choose
any number of minutes between 1 and 60.

See the relevant documentation for a full description of netflow commands.

Note: Netflow versions v5 and v7 have 32 bit counter values. On very busy
routers the number of packets or bytes of a flow may overflow this value
within the flow-cache timeout. To prevent overflow, consider reducing the
flow-cache timeout to a lower value. All nfdump tools use 64 bit counters
internally, which means that all aggregated values are correctly reported.
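
To get a feeling for when a 32 bit byte counter can wrap, here is a
back-of-the-envelope calculation in Python. It assumes a single flow
sending at a constant rate for the whole active timeout. The function
names are hypothetical and the rates are examples only.

```python
COUNTER_MAX_32 = 2**32  # a 32 bit counter wraps after this many bytes


def seconds_to_overflow(bits_per_second):
    """Seconds one flow at a sustained rate needs to wrap a 32 bit byte counter."""
    bytes_per_second = bits_per_second / 8
    return COUNTER_MAX_32 / bytes_per_second


def may_overflow(bits_per_second, active_timeout_min):
    """True if the counter can wrap before the flow is exported."""
    return seconds_to_overflow(bits_per_second) < active_timeout_min * 60


print(round(seconds_to_overflow(1e9), 1))  # 34.4 seconds at 1 Gbit/s
print(may_overflow(1e9, 5))                # True
print(may_overflow(100e6, 5))              # False
```

Under these assumptions a single 1 Gbit/s flow wraps the byte counter in
about 34 seconds, so even a 1 minute active timeout would be too long
for it, while a 100 Mbit/s flow stays below the limit with a 5 minute
timeout.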

The binary format of the data files is netflow version independent.
For speed reasons the binary format is machine architecture dependent, and
as such cannot be exchanged between little and big endian systems.
Internally nfdump does all processing IP protocol independent, which means
everything works for IPv4 as well as IPv6 addresses.
See the nfdump(1) man page for details. 
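
Why native byte order matters can be shown with a generic Python
example. It does not use the actual nfdump file format; it only packs a
single 64 bit counter value in both byte orders and shows what happens
when the bytes are read with the wrong one.

```python
import struct
import sys

value = 1  # stands in for a 64 bit packet or byte counter

little = struct.pack("<Q", value)  # as written on a little endian host
big = struct.pack(">Q", value)     # as written on a big endian host

print(little.hex())  # 0100000000000000
print(big.hex())     # 0000000000000001

# Reading little endian bytes as if they were big endian
misread = struct.unpack(">Q", little)[0]
print(misread)       # 72057594037927936, i.e. 2**56

print(sys.byteorder)  # byte order of the machine running this script
```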

netflow version 9:
Although netflow v9 is supported, not all elements defined in netflow v9
are stored in the data files. As of version 1.5 nfdump supports the
following fields:
    NF9_LAST_SWITCHED
    NF9_FIRST_SWITCHED
    NF9_IN_BYTES
    NF9_IN_PACKETS
    NF9_FLOWS
    NF9_IN_PROTOCOL
    NF9_SRC_TOS
    NF9_TCP_FLAGS
    NF9_IPV4_SRC_ADDR
    NF9_IPV6_SRC_ADDR
    NF9_IPV4_DST_ADDR
    NF9_IPV6_DST_ADDR
    NF9_L4_SRC_PORT
    NF9_L4_DST_PORT
    NF9_INPUT_SNMP
    NF9_OUTPUT_SNMP
    NF9_SRC_AS
    NF9_DST_AS
32 and 64 bit counters are supported for Bytes and Packets. More
fields may be supported in the future.

nfcapd can listen on IPv6 or IPv4. Furthermore multicast is supported.

Flow-tools compatibility
------------------------
When building with the configure option --enable-ftconv, the flow-tools
converter is included. Using this converter, any data created by flow-tools
can be read, processed and stored by nfdump.

Example:

	flow-cat [options] | ft2nfdump | nfdump [options]


See the INSTALL file for installation details.