flow_models.elephants package

flow_models.elephants.plot_entropy module

Generates a plot of feature entropy and (optionally) feature importances.

calculate_entropy(directory)

Calculate entropy of 5-tuple features for given flow records directory.

Parameters:

directory (os.PathLike) – binary flow records directory

Returns:

entropy for successive bytes and bits of the (sa, da, sp, dp, prot) fields

Return type:

dict
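To illustrate what per-byte entropy of a 5-tuple field means, here is a minimal sketch that computes the Shannon entropy of each byte position of 32-bit values, such as IPv4 source addresses. It is not the package's implementation: the helper name `byte_entropy` is hypothetical, and the input is assumed to be a plain sequence of integers rather than a binary flow records directory.

```python
import numpy as np


def byte_entropy(values):
    """Return Shannon entropy (in bits) for each byte position of 32-bit values.

    Hypothetical helper for illustration only; not part of flow_models.
    """
    # Big-endian layout, so byte position 0 is the most significant byte.
    as_u4 = np.asarray(values, dtype=">u4")
    as_bytes = as_u4.view(np.uint8).reshape(-1, 4)
    entropies = []
    for position in range(4):
        # Histogram of the 256 possible byte values at this position.
        counts = np.bincount(as_bytes[:, position], minlength=256)
        probs = counts[counts > 0] / counts.sum()
        entropies.append(float(-(probs * np.log2(probs)).sum()))
    return entropies


# Four addresses in 10.0.0.0/24: only the last byte varies.
addresses = [0x0A000001, 0x0A000002, 0x0A000003, 0x0A000004]
result = byte_entropy(addresses)
```

With these inputs the first three byte positions are constant and so carry zero entropy, while the last byte takes four equally likely values and carries two bits.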

flow_models.elephants.simulate_data module

Simulates a flow table for elephant flow classification using a flow records data directory.

simulate_data(directory, index=Ellipsis, mask=None, pps=None, fps=None, timeout=15, max_seconds=3600)

Simulate flow table occupancy reduction curve for given flow records.

Parameters:
  • directory (os.PathLike) – binary flow records directory

  • index (np.array, default Ellipsis) – index array of flows to use for simulating data

  • mask (np.array[bool], optional) – flows to add to flow table (elephants)

  • pps (float, optional) – packets per second; when None, flow record times are used for the calculation

  • fps (float, optional) – flows per second; when None, flow record times are used for the calculation

  • timeout (float, default 15.0) – inactive flow table timeout in seconds

  • max_seconds (int, default 3600) – total duration of the simulation in seconds

Returns:

  • flows_sum (int) – sum of flows added to flow table

  • octets_sum (int) – sum of octets transmitted by flows while they are in the flow table

  • flows_slots (np.array) – number of flows present in flow table in each second

  • octets_slots (np.array) – number of octets transmitted by flows in the flow table in each second
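The per-second occupancy array can be pictured with a small sketch. It assumes that a flow entry occupies the table from the second of its first packet until `timeout` seconds after its last packet; that model, the helper name `occupancy_per_second`, and its `first`/`last` arguments are illustrative assumptions, not the package's actual algorithm or API.

```python
import numpy as np


def occupancy_per_second(first, last, timeout=15, max_seconds=3600):
    """Count flow table entries present in each one-second slot.

    Hypothetical helper for illustration only; not part of flow_models.
    `first` and `last` are whole-second timestamps of each flow's first
    and last packet, relative to the start of the simulation.
    """
    first = np.asarray(first, dtype=np.int64)
    last = np.asarray(last, dtype=np.int64)
    # Difference array: +1 where an entry is created, -1 where it expires.
    delta = np.zeros(max_seconds + 1, dtype=np.int64)
    start = np.clip(first, 0, max_seconds)
    end = np.clip(last + timeout, 0, max_seconds)
    # np.add.at accumulates correctly when several flows share an index.
    np.add.at(delta, start, 1)
    np.add.at(delta, end, -1)
    # A running sum turns creations and expirations into occupancy.
    return np.cumsum(delta)[:max_seconds]


# Two short flows with a 2-second inactive timeout over a 6-second window.
slots = occupancy_per_second(first=[0, 2], last=[1, 2], timeout=2, max_seconds=6)
```

Under this model, passing only the flows selected by a boolean mask (the elephants) would yield a lower occupancy curve than passing all flows, which is the kind of reduction the simulation is meant to quantify.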