
Architecting Edge VM and Network Flow Feeds

Developing an optimal architecture for a FireScope SDDM implementation

Overview

This document is intended to aid in developing an optimal architecture for a FireScope SDDM implementation. As with any such guide, knowledge is evolving and this document will evolve as new information comes in. Additionally, every IT environment is unique, and therefore there may be exceptions to these rules based on your specific environment.

Types of Data Feeds

NetFlow / sFlow / IPFIX – These are statistical feeds of network traffic, with each row representing a distinct connection between two nodes.

Each row includes the following information:

  • Source IP
  • Destination IP
  • Source Port
  • Destination Port
  • Layer 3 Protocol
  • Count of Packets
  • Size of traffic, in bytes

Over a user-configurable period (15 minutes, 1 hour, 2 hours, etc.), the Edge VM consumes these feeds from one or more sources. If a feed includes rows where the first five columns match, the data is consolidated into a single row and a counter is added for the number of times this combination was seen. When the end of the collection period is reached, the Edge VM transmits this feed to the Application VIP (for SaaS implementations) via RabbitMQ. Once acknowledgment is received that all data has arrived, the data is overwritten by incoming feeds.
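
The sketch below is a minimal illustration of this consolidation idea, not FireScope's actual implementation: flow rows sharing the same five-tuple are merged, packet and byte counts are summed, and an occurrence counter is kept until the collection period ends. All names and sample values are assumptions for illustration.

```python
# Minimal sketch of five-tuple consolidation (illustrative only).
from collections import defaultdict
from typing import NamedTuple

class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int          # protocol number, e.g. 6 = TCP

def consolidate(rows):
    """rows: iterable of (FlowKey, packets, bytes_) tuples for one collection period."""
    totals = defaultdict(lambda: {"packets": 0, "bytes": 0, "seen": 0})
    for key, packets, bytes_ in rows:
        totals[key]["packets"] += packets
        totals[key]["bytes"] += bytes_
        totals[key]["seen"] += 1      # how many raw rows matched this five-tuple
    return totals

sample = [
    (FlowKey("10.1.1.5", "10.2.2.9", 51514, 443, 6), 12, 9300),
    (FlowKey("10.1.1.5", "10.2.2.9", 51514, 443, 6), 4, 2100),
]
print(consolidate(sample))
```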

FireScope SDDM supports NetFlow v1/v5/v7/v8/v9, sFlow v2/v4/v5 and IPFIX. All of this traffic is UDP. NetFlow and IPFIX should point to port 2100 on the FireScope Edge VM; sFlow should point to port 6343.
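
If you want a quick, informal check that the firewall path from an exporting segment toward the Edge VM is open, something like the following can help. This is not a FireScope tool; the Edge VM address is a placeholder, and because UDP is connectionless it only exercises the outbound path, so confirm arrival on the Edge VM itself (for example with a packet capture on ports 2100 and 6343).

```python
# Send throwaway UDP datagrams toward the Edge VM's collector ports (illustrative only).
import socket

EDGE_VM = "10.0.0.50"                       # placeholder Edge VM address
PORTS = {"NetFlow/IPFIX": 2100, "sFlow": 6343}

for name, port in PORTS.items():
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(b"firewall-path-test", (EDGE_VM, port))
        print(f"sent test datagram for {name} to {EDGE_VM}:{port}")
```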

Promiscuous Mode / SPAN / Port Mirroring – In this case, traffic is copied from one or more ports, one or more EtherChannels, or one or more VLANs and is sent to a FireScope Edge VM. The Edge VM then extracts the same information contained in a NetFlow row; if a URL is detected in the header of the packet, it is extracted and appended to the data. The packet is then discarded; all of this takes place in memory.

If the packet is encrypted and the Edge VM does not have the key, the packet contents cannot be read. However, the header data, which includes all of the information required by the solution, is not encrypted, so this does not pose a problem for the solution (noted here simply as an FYI).
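
To make the idea concrete, the sketch below shows what this kind of processing looks like in principle: read mirrored packets, keep the NetFlow-style header fields, pull a host/URL from clear-text HTTP when present, then drop the packet. It uses the third-party scapy library and a placeholder interface name purely for illustration; the Edge VM performs this internally and does not use this code.

```python
# Illustrative SPAN-style header extraction (requires scapy and root privileges).
from scapy.all import sniff, IP, TCP, Raw

def summarize(pkt):
    if IP not in pkt or TCP not in pkt:
        return
    record = {
        "src": pkt[IP].src, "dst": pkt[IP].dst,
        "sport": pkt[TCP].sport, "dport": pkt[TCP].dport,
        "proto": pkt[IP].proto, "bytes": len(pkt),
    }
    # Only clear-text HTTP exposes a Host header; encrypted payloads are skipped.
    if Raw in pkt and b"Host:" in pkt[Raw].load:
        host_line = pkt[Raw].load.split(b"Host:")[1].split(b"\r\n")[0]
        record["url_host"] = host_line.strip().decode(errors="replace")
    print(record)                                  # the packet itself is discarded

sniff(iface="eth1", prn=summarize, store=False)    # "eth1" is a placeholder SPAN interface
```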

For the majority of scenarios, NetFlow/sFlow data is preferred. The one benefit of raw packet data is the ability to identify URLs. You do not have to choose one or the other; in fact, best practice is to use a combination, with the majority of data coming in via NetFlow/sFlow.

Selecting Edge VM Deployments

A single FireScope Edge virtual machine can handle massive amounts of incoming flow data, especially when that data arrives as NetFlow feeds. Therefore, the primary factors to consider are network access and bandwidth. An Edge VM needs to be able to receive NetFlow feeds on port 2100 and sFlow feeds on port 6343, while raw packet data arrives on its original destination ports. Additionally, the solution needs to be able to poll SNMP-enabled devices such as load balancers, switches and routers over port 161.
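
A simple way to confirm SNMP access from a candidate Edge VM location is to poll sysDescr on a device in that zone. The sketch below assumes the synchronous hlapi of the third-party pysnmp library; the device name and community string are placeholders, not FireScope settings.

```python
# Hedged SNMP (UDP/161) reachability check from the candidate Edge VM segment.
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

target = "switch01.example.com"                    # placeholder device
error_indication, error_status, _, var_binds = next(getCmd(
    SnmpEngine(),
    CommunityData("public", mpModel=1),            # SNMP v2c, placeholder community
    UdpTransportTarget((target, 161)),
    ContextData(),
    ObjectType(ObjectIdentity("1.3.6.1.2.1.1.1.0"))))   # sysDescr.0

if error_indication or error_status:
    print(f"{target}: not reachable over SNMP ({error_indication or error_status})")
else:
    for oid, value in var_binds:
        print(f"{target}: {oid} = {value}")
```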

Are there any subnets within the environment out of which this data cannot travel? This is commonly the case in PCI exclusion zones, DMZs and similar segments. In these cases, allocate an Edge VM for that zone; it can consume local flows, collect data, and then push directly to the FireScope Cloud for processing.

The next factor is bandwidth: are there any business locations with servers that contribute to services we anticipate mapping, but which have limited bandwidth connectivity to the central data center? Locations that have only workstations are not a concern; the solution can see their incoming requests at the data center, and that is the extent of what is needed from workstations. Where a location does have limited bandwidth, the best practice is to deploy an Edge VM at that location to push data directly to the FireScope Cloud.
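
When weighing whether a remote site warrants its own Edge VM, a rough back-of-the-envelope estimate of the flow-export traffic that would otherwise cross the WAN can help. Every number in the sketch below is a placeholder assumption, not a FireScope figure.

```python
# Rough sizing sketch for WAN impact of forwarding raw flow records centrally.
FLOWS_PER_SECOND = 2000           # assumed flow rate at the remote site
BYTES_PER_FLOW_RECORD = 50        # assumed per-record size (NetFlow v5-class)
OVERHEAD = 1.2                    # assumed UDP/IP and export header overhead

bits_per_second = FLOWS_PER_SECOND * BYTES_PER_FLOW_RECORD * OVERHEAD * 8
print(f"~{bits_per_second / 1_000_000:.1f} Mbit/s of flow export traffic")
```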

Choosing Your Flow Feed Strategy

When deciding which flow feeds to use, we recommend starting with the virtual infrastructure, particularly since most organizations are now using virtualization in the majority of their infrastructure. We also recommend using a combination of feeds for best coverage.

Best Practice:

  • Send NetFlow feeds from VMware Virtual Distributed Switches (VDS); make sure to enable NetFlow on Distributed Port Groups as well.
  • Send NetFlow/sFlow from load balancers.
  • Fill in gaps for physical servers by selectively collecting NetFlow/sFlow feeds from upstream switches.
  • Finally, select one or two core switches between users and the data center and feed mirrored port / network TAP data in order to identify the URLs that users are requesting. NetFlow/sFlow would also be an option here, but would not provide visibility into those URLs.

Virtual Infrastructure

We tackle the virtual infrastructure first not only because most organizations are heavily virtualized, but also because of scenarios where two or more virtual machines on the same physical host contribute to the same service. In this scenario, the traffic goes over the loopback adapter and may not be visible outside of the host. The following outlines the options available for virtualized environments, depending on configuration.

Scenario – Recommended Flow Configuration

  • VMware, Virtual Distributed Switches (VDS) in use – NetFlow configured on the VDS, pointing to a FireScope Edge VM within network reach. Be sure to enable NetFlow on the uplink ports as well.
  • VMware, VDS not available – Configure sFlow from Virtual Connect modules.
  • VMware, Cisco Nexus 1000v in use – Promiscuous mode; requires an Edge VM on each physical ESX host.
  • Microsoft Hyper-V – sFlow to a FireScope Edge VM within network reach.
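
As an illustration of the first scenario, the sketch below points a vSphere Distributed Switch's NetFlow/IPFIX collector at an Edge VM using the pyVmomi library. This is not an official FireScope procedure; the vCenter hostname, credentials, switch name, collector address and timeout values are all placeholder assumptions, and certificate verification is disabled purely for lab use.

```python
# Hedged sketch: configure VDS NetFlow/IPFIX export toward a FireScope Edge VM.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

context = ssl._create_unverified_context()            # lab use only
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="password", sslContext=context)

content = si.RetrieveContent()
# Find the distributed switch by name (placeholder name).
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.DistributedVirtualSwitch], True)
dvs = next(d for d in view.view if d.name == "DSwitch-Prod")

spec = vim.dvs.VmwareDistributedVirtualSwitch.ConfigSpec()
spec.configVersion = dvs.config.configVersion
spec.ipfixConfig = vim.dvs.VmwareDistributedVirtualSwitch.IpfixConfig(
    collectorIpAddress="10.0.0.50",    # FireScope Edge VM (assumed address)
    collectorPort=2100,                # NetFlow/IPFIX port on the Edge VM
    activeFlowTimeout=60,
    idleFlowTimeout=15,
    samplingRate=0,
    internalFlowsOnly=False)

dvs.ReconfigureDvs_Task(spec)
Disconnect(si)
```

Remember that collector settings on the switch are only half of the configuration; per Distributed Port Group (and uplink) NetFlow monitoring still has to be enabled, as noted in the best-practice list above.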

Load Balancers

Whenever possible, it is best practice to feed NetFlow/sFlow data from load balancers in order to map the traffic from VIPs to member IPs. Additionally, FireScope includes a load balancer discovery capability that automatically maps VIPs to members using SNMP polling.

Physical Servers / Cloud Computing

For the remaining physical servers in the environment, we have a number of options to choose from, depending on the number of servers left and the access level to the environment.

NetFlow / sFlow / SPAN / Port Mirroring from Upstream Switches – This maximizes coverage and allows us to see client requests calling the entry point of the service.

OS Collector – For scenarios where there is no access to the underlying network, such as compute resources running in a third-party cloud such as Azure, an OS Collector is available for Windows, Linux and Unix. This collector performs full packet capture, and the captured data is then sent to an Edge VM.

What's Next

The article Key Steps for Deployment continues the instructions on preparing for an SDDM SaaS deployment.
