convert
Converts flow records between supported formats. Can be used for filtering and cutting flow record files.
usage: python3 -m flow_models.convert [-h] [-i {csv_flow,pipe,nfcapd,binary}]
[-o {csv_flow,binary,append,extend,none}]
[-O OUTPUT] [--skip-in SKIP_IN]
[--count-in COUNT_IN]
[--skip-out SKIP_OUT]
[--count-out COUNT_OUT]
[--filter-expr FILTER_EXPR]
in_files [in_files ...]
Positional Arguments
- in_files
input files or directories
Named Arguments
- -i, --in-format
Possible choices: csv_flow, pipe, nfcapd, binary
format of input files
Default:
'nfcapd'
- -o, --out-format
Possible choices: csv_flow, binary, append, extend, none
format of output
Default:
'csv_flow'
- -O, --output
file or directory for output
Default:
'-'
- --skip-in
number of flows to skip at the beginning of input
Default:
0
- --count-in
limit for number of flows to read from input
- --skip-out
number of flows to skip after filtering
Default:
0
- --count-out
limit for number of flows to output after filtering
- --filter-expr
filter expression
To convert flow records between formats, specify the input and output formats on the command line. The output file or directory is given with the -O parameter; when it is omitted, standard output (-) is used. Input files or directories are given as positional arguments.
Example: (reads flow records in binary format and writes them as CSV lines to standard output)
flow_models.convert -i binary -o csv_flow -O - sorted
To filter flow records, specify a filter expression. Filter expressions use Python syntax. Bitwise operators (&, |, ~) should be used instead of logical ones (and, or, not). The following fields are available:
af, prot, inif, outif, sa0, sa1, sa2, sa3, da0, da1, da2, da3, sp, dp, first, first_ms, last, last_ms, packets, octets, aggs
Example: (selects HTTPS protocol flows and writes them in binary format)
flow_models.convert -i binary -o binary -O https_only --filter-expr "(prot==6) & ((sp==443) | (dp==443))" sorted
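The parentheses around each comparison in the filter above matter, because in Python the bitwise operators bind more tightly than comparisons. The following is a minimal sketch of that behaviour, not the tool's own implementation: it evaluates the expression record by record with eval, the records are made-up illustrative data, and select is a hypothetical helper.

```python
# Hypothetical flow records; only three of the documented fields are filled in.
records = [
    {"prot": 6, "sp": 443, "dp": 51000},    # TCP, source port 443
    {"prot": 17, "sp": 53, "dp": 40000},    # UDP, source port 53
    {"prot": 6, "sp": 50000, "dp": 443},    # TCP, destination port 443
    {"prot": 6, "sp": 80, "dp": 52000},     # TCP, source port 80
]

FILTER_EXPR = "(prot==6) & ((sp==443) | (dp==443))"


def select(records, expr):
    """Return the records for which expr is true (per-record sketch only)."""
    code = compile(expr, "<filter-expr>", "eval")
    # Field names are resolved from each record; builtins are not exposed.
    return [r for r in records if eval(code, {"__builtins__": {}}, r)]


selected = select(records, FILTER_EXPR)

# Pitfall: & binds tighter than ==, so without parentheses the expression
# "prot==6 & sp==443" parses as the chained comparison prot == (6 & sp) == 443.
unparenthesized = select(records, "prot==6 & sp==443")

print(selected)
print(unparenthesized)
```

In this sketch the parenthesized filter keeps the two TCP records that use port 443, while the unparenthesized variant matches nothing.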
Flow records can be cut with the skip_in, count_in, skip_out and count_out parameters. They specify how many flow records are skipped (skip_in) and then read (count_in) from the input, and how many are skipped (skip_out) and then written (count_out) after filtering.
Example: (skips the first 100 records and writes the next 1000)
flow_models.convert -i binary -o binary -O sample --skip-in 100 --count-in 1000 sorted
When no filter is used, skip_in gives the same results as skip_out, and count_in gives the same results as count_out. However, depending on the input format, skip_in and count_in may perform better than skip_out and count_out.
When both input and output formats are binary and no filter expression is used, it is more efficient to use the cut tool, which uses dd to cut binary record files.
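The order of operations described above (skip and limit the input, filter, then skip and limit the output) can be modelled with itertools.islice. This is only an illustrative model of the documented semantics, assuming flows are processed as a simple stream; cut_flows and the integer "flows" are hypothetical stand-ins, not part of flow_models.

```python
from itertools import islice


def cut_flows(flows, skip_in=0, count_in=None, skip_out=0, count_out=None, keep=None):
    """Model of the cutting order: cut the input, filter, then cut the output."""
    stop_in = None if count_in is None else skip_in + count_in
    read = islice(flows, skip_in, stop_in)              # skip_in, then count_in
    filtered = read if keep is None else filter(keep, read)
    stop_out = None if count_out is None else skip_out + count_out
    return list(islice(filtered, skip_out, stop_out))   # skip_out, then count_out


flows = range(20)  # stand-in for 20 flow records


def is_even(flow):
    """Stand-in for a filter expression."""
    return flow % 2 == 0


# Without a filter, input-side and output-side cutting agree.
no_filter_in = cut_flows(flows, skip_in=5, count_in=3)
no_filter_out = cut_flows(flows, skip_out=5, count_out=3)

# With a filter, they differ: the input-side cut is applied before filtering.
with_in = cut_flows(flows, skip_in=5, count_in=3, keep=is_even)
with_out = cut_flows(flows, skip_out=5, count_out=3, keep=is_even)

print(no_filter_in, no_filter_out)  # [5, 6, 7] [5, 6, 7]
print(with_in, with_out)            # [6] [10, 12, 14]
```

With the filter in place, the input-side cut reads records 5 to 7 and keeps only the even one, whereas the output-side cut skips the first five matching records and returns the next three.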
Converting, filtering and cutting can be done simultaneously in a single command call.
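As a sketch of such a combined call, the helper below assembles one command line that converts, filters and cuts at once. It uses only the options listed in the usage summary; build_convert_command and the output name https_sample.csv are hypothetical, and the command is only constructed here, not executed.

```python
import shlex


def build_convert_command(in_files, in_format="nfcapd", out_format="csv_flow",
                          output="-", skip_in=0, count_in=None,
                          skip_out=0, count_out=None, filter_expr=None):
    """Build an argv list for python3 -m flow_models.convert (hypothetical helper)."""
    argv = ["python3", "-m", "flow_models.convert",
            "-i", in_format, "-o", out_format, "-O", output]
    if skip_in:
        argv += ["--skip-in", str(skip_in)]
    if count_in is not None:
        argv += ["--count-in", str(count_in)]
    if skip_out:
        argv += ["--skip-out", str(skip_out)]
    if count_out is not None:
        argv += ["--count-out", str(count_out)]
    if filter_expr is not None:
        argv += ["--filter-expr", filter_expr]
    return argv + list(in_files)


# Convert binary to CSV, keep only HTTPS flows, and write the first 1000 matches.
argv = build_convert_command(
    ["sorted"],
    in_format="binary",
    out_format="csv_flow",
    output="https_sample.csv",  # hypothetical output file name
    count_out=1000,
    filter_expr="(prot==6) & ((sp==443) | (dp==443))",
)

# shlex.join quotes the filter expression so it survives a POSIX shell.
command_line = shlex.join(argv)
print(command_line)
```

The resulting list could be passed to subprocess.run(argv, check=True) on a system where flow_models is installed; passing a list rather than a shell string avoids having to quote the filter expression by hand.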
The convert tool can be used to convert flow records between the supported formats (the nfcapd format and the CSV and binary flow formats described in File formats).