convert

Converts flow records between supported formats. Can be used for filtering and cutting flow record files.

usage: python3 -m flow_models.convert [-h] [-i {csv_flow,pipe,nfcapd,binary}]
                                      [-o {csv_flow,binary,append,extend,none}]
                                      [-O OUTPUT] [--skip-in SKIP_IN]
                                      [--count-in COUNT_IN]
                                      [--skip-out SKIP_OUT]
                                      [--count-out COUNT_OUT]
                                      [--filter-expr FILTER_EXPR]
                                      in_files [in_files ...]

Positional Arguments

in_files

input files or directories

Named Arguments

-i, --in-format

Possible choices: csv_flow, pipe, nfcapd, binary

format of input files

Default: 'nfcapd'

-o, --out-format

Possible choices: csv_flow, binary, append, extend, none

format of output

Default: 'csv_flow'

-O, --output

file or directory for output

Default: '-'

--skip-in

number of flows to skip at the beginning of input

Default: 0

--count-in

limit for number of flows to read from input

--skip-out

number of flows to skip after filtering

Default: 0

--count-out

limit for number of flows to output after filtering

--filter-expr

expression of filter

To convert flow records between formats, specify the input and output formats on the command line with -i and -o. The output file or directory is given with the -O parameter; when it is omitted, standard output (-) is used. Input files or directories are given as positional arguments.

Example: (reads flow records in binary format and outputs as csv lines to standard output)

flow_models.convert -i binary -o csv_flow -O - sorted

To filter flow records, specify a filter expression with --filter-expr. The expression uses Python syntax. Bitwise operators (&, |, ~) should be used instead of logical ones (and, or, not). The following fields are available:

af, prot, inif, outif, sa0, sa1, sa2, sa3, da0, da1, da2, da3, sp, dp, first, first_ms, last, last_ms, packets, octets, aggs

Example: (selects HTTPS protocol flows and writes them in binary format)

flow_models.convert -i binary -o binary -O https_only --filter-expr "(prot==6) & ((sp==443) | (dp==443))" sorted
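The parentheses in the expression above matter because of Python operator precedence: & and | bind more tightly than ==. The short sketch below illustrates this with plain Python integers standing in for the prot, sp and dp fields. The values are made up for illustration, and the snippet says nothing about how the convert tool evaluates expressions internally.

```python
# Illustrative only: plain Python ints stand in for flow record fields.
prot, sp, dp = 6, 443, 51234  # hypothetical TCP flow with source port 443

# Parenthesized form, as in the example above: each comparison is
# evaluated first, then the results are combined with & and |.
with_parens = (prot == 6) & ((sp == 443) | (dp == 443))

# Without parentheses, & binds tighter than ==, so Python reads this as
# the chained comparison: prot == (6 & sp) == 443.
without_parens = prot == 6 & sp == 443

print(with_parens)     # True
print(without_parens)  # False
```

Wrapping every comparison in its own parentheses before combining it with &, | or ~ avoids this pitfall.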

Flow records can be cut with the --skip-in, --count-in, --skip-out and --count-out parameters. They specify how many flow records are skipped (--skip-in) and then read (--count-in) from the input, and how many are skipped (--skip-out) and then written (--count-out) after filtering.

Example: (skips the first 100 records and writes the next 1000)

flow_models.convert -i binary -o binary -O sample --skip-in 100 --count-in 1000 sorted

When no filter is used, --skip-in gives the same result as --skip-out, and --count-in the same result as --count-out. However, depending on the input format, --skip-in and --count-in may perform better than --skip-out and --count-out.
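The order of operations described above (cut the input, filter, then cut the output) can be modelled in a few lines of Python. This is only a sketch of the documented semantics, not the tool's implementation: the cut function, its keep predicate (a stand-in for the filter expression) and the integer "records" are all hypothetical.

```python
from itertools import islice

def cut(records, skip_in=0, count_in=None, skip_out=0, count_out=None, keep=None):
    """Hypothetical model of the cutting order: input cut, filter, output cut."""
    # Skip skip_in records, then read at most count_in records from input.
    stop_in = None if count_in is None else skip_in + count_in
    stream = islice(records, skip_in, stop_in)
    # Apply the filter (keep is a stand-in predicate for the filter expression).
    if keep is not None:
        stream = filter(keep, stream)
    # Skip skip_out filtered records, then write at most count_out records.
    stop_out = None if count_out is None else skip_out + count_out
    return list(islice(stream, skip_out, stop_out))

records = list(range(10))          # stand-in for ten flow records
is_even = lambda r: r % 2 == 0     # stand-in filter

# Without a filter, input-side and output-side cutting agree.
no_filter_in = cut(records, skip_in=2, count_in=3)      # [2, 3, 4]
no_filter_out = cut(records, skip_out=2, count_out=3)   # [2, 3, 4]

# With a filter, they select different records.
filtered_in = cut(records, skip_in=2, count_in=3, keep=is_even)     # [2, 4]
filtered_out = cut(records, skip_out=2, count_out=3, keep=is_even)  # [4, 6, 8]
```

In this model the input-side cut stops reading once count_in records have been consumed, which is one plausible reason why input-side cutting can be cheaper.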

When both input and output formats are binary and no filter expression is used, it is more efficient to use the cut tool, which uses dd to cut binary record files.

Converting, filtering and cutting can be done simultaneously in a single command call.

The convert tool can be used to convert flow records between the supported formats (the nfcapd format and the CSV and binary flow formats described in File formats).