
SDN Presentation at Winter 2015 Common Solutions Group Meeting

One of the workshop topics at the Winter 2015 Common Solutions Group (CSG) meeting, held at the University of California, Berkeley, from 1/14/15 to 1/16/15, was “Democratizing the Network with Software Defined Networking (SDN)”. Details of the workshop are available here:

CSG Workshop

Charley Kneifel and Mark McCahill participated in the workshop, and Charley gave a presentation on “Duke’s SDN Journey”.

In addition to the presentation (link below), Mark McCahill gave a demonstration of SwitchBoard which is documented elsewhere on this site. Both Mark and Charley participated in a round table discussion on the topic of SDNs in general.

CSG – Winter 2015 – Duke’s SDN Journey

As always, we acknowledge the support of the National Science Foundation.

This work was supported by NSF under the following grants:

NSF OCI-1246042 – CC-NIE
NSF CNS 1243315 – EAGER

AL2S Network Setup and Testing Summary

In an earlier post I described the setup of the new v3.4 perfSONAR node used to test the AL2S network.  In this post I will summarize the results of the one-week test run on the AL2S link to UChicago, along with a description of the work that Victor did to set up the link.

Setup

The TelCom Arista has a perfSONAR box connected to it on port 17.
The TelCom Arista *also* has a connection to AL2S, via MCNC, which arrives on port 1.
We share the AL2S connection with RENCI, and RENCI has allocated us 100 VLAN tags to use (1000-1099, inclusive).

Last week, Victor allocated a 5 Gbps circuit for testing between Duke and Chicago.
It used VLAN tag 1000 (on our end) and expired at 4 PM on Monday (10/20/14).
AL2S performs a tag flip somewhere along the link, so our VLAN tag 1000 becomes VLAN tag 3000 at Chicago’s end.

The first thing we did was to test in traditional switch mode.
This meant:
1) Allocating VLAN 1000 on the TelCom Arista
2) Ensuring that port 1 was in trunk mode, and that tags 1000-1099 were allowed
3) Ensuring that port 17 was in access mode for tag 1000
4) Setting up a VLAN interface for tag 1000

We agreed with Chicago to use the private subnet 192.168.10.0/24 for testing.
The VLAN interface on the TelCom Arista was configured to use address 192.168.10.2; the Chicago switch was configured to use 192.168.10.1.
After confirming connectivity between the switches, the perfSONAR boxes at each end were configured with addresses 192.168.10.3 (Duke) and 192.168.10.4 (Chicago).
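
With the addresses in place, reachability across the circuit can be sanity-checked from the Duke perfSONAR box.  This is only a sketch; the jumbo-frame check assumes the hosts and the path are configured for a 9000-byte MTU, which is not stated for the traditional-mode test:

ping -c 5 192.168.10.4                 # the Chicago perfSONAR box should answer
ping -c 5 -M do -s 8972 192.168.10.4   # verify 9000-byte frames cross the circuit
                                       # (8972 payload + 8 ICMP + 20 IP = 9000 bytes)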

Once we had done a bit of testing in traditional switch mode, we chose to test the OpenFlow capability, using rest_router.
To set this up, Victor did the following:
1) Add port 1 to the OpenFlow instance on the TelCom Arista
2) Add port 17 to the OpenFlow instance on the TelCom Arista
3) Configure port 17 into trunk mode on the TelCom Arista; ports in access mode do not function in OpenFlow mode
4) Add a vlan tagged interface to the PerfSonar box hanging off port 17; this was accomplished thus:
a) ifconfig p2p1 0.0.0.0 (clear any address from the untagged parent interface)
b) modprobe 8021q (load the 802.1Q VLAN tagging kernel module)
c) vconfig add p2p1 1000 (create the tagged sub-interface p2p1.1000)
d) ifconfig p2p1.1000 192.168.10.3 netmask 255.255.255.0 up mtu 9000 (address the tagged interface and enable jumbo frames)
5) Add the private address range to rest_router’s management using curl, thus:
curl -X POST -d '{"address":"192.168.10.254/24"}' http://sdn-prod-01.oit.duke.edu:8080/router/0000001c7365b107/1000

At this point, packets were being switched onto the AL2S link, within the TelCom Arista, using OpenFlow rules.
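
For reference, Ryu’s rest_router application also answers GET requests, so the configuration can be read back to confirm the address was registered.  A quick check (a sketch, not part of the original setup steps) would be:

curl http://sdn-prod-01.oit.duke.edu:8080/router/0000001c7365b107/1000
# returns the addresses (and any routes) configured for VLAN 1000 on that switch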

Since roughly Tuesday afternoon, perfSONAR has been running measurements against the peer perfSONAR box at Chicago.
The bandwidth data has been pretty noisy, both with traditional switching and with OpenFlow.
We suspect there is some bandwidth contention, and we would be curious to see how the results look with bwctl run in parallel mode from within the web interface.
When bwctl is run manually (with the degree of parallelism specified explicitly), we get fairly consistent (and high) results that exceed the requested bandwidth of the circuit, typically very close to 10 Gbps:

Command Line BWCTL Results:

[root@tel-perfsonar-01 ~]# bwctl -c 192.168.10.4 -P8
bwctl: Using tool: iperf
bwctl: 36 seconds until test results available
RECEIVER START
------------------------------------------------------------
Server listening on TCP port 5239
Binding to local address 192.168.10.4
TCP window size: 87380 Byte (default)
------------------------------------------------------------
[ 15] local 192.168.10.4 port 5239 connected with 192.168.10.3 port 57265
[ 17] local 192.168.10.4 port 5239 connected with 192.168.10.3 port 49730
[ 16] local 192.168.10.4 port 5239 connected with 192.168.10.3 port 58667
[ 18] local 192.168.10.4 port 5239 connected with 192.168.10.3 port 53343
[ 19] local 192.168.10.4 port 5239 connected with 192.168.10.3 port 41485
[ 20] local 192.168.10.4 port 5239 connected with 192.168.10.3 port 55678
[ 21] local 192.168.10.4 port 5239 connected with 192.168.10.3 port 39032
[ 22] local 192.168.10.4 port 5239 connected with 192.168.10.3 port 39575

[ ID] Interval       Transfer     Bandwidth

[ 22]  0.0-10.0 sec  3123445760 Bytes  2488803881 bits/sec
[ 22] MSS size 8948 bytes (MTU 8988 bytes, unknown interface)
[ 17]  0.0-10.1 sec  1780350976 Bytes  1413963232 bits/sec
[ 17] MSS size 8948 bytes (MTU 8988 bytes, unknown interface)
[ 15]  0.0-10.1 sec  1054474240 Bytes  832376766 bits/sec
[ 15] MSS size 8948 bytes (MTU 8988 bytes, unknown interface)
[ 21]  0.0-10.1 sec  1462370304 Bytes  1154438086 bits/sec
[ 21] MSS size 8948 bytes (MTU 8988 bytes, unknown interface)
[ 18]  0.0-10.2 sec  1421213696 Bytes  1117543689 bits/sec
[ 18] MSS size 8948 bytes (MTU 8988 bytes, unknown interface)
[ 20]  0.0-10.2 sec  1135476736 Bytes  890163568 bits/sec
[ 20] MSS size 8948 bytes (MTU 8988 bytes, unknown interface)
[ 16]  0.0-10.2 sec  858259456 Bytes  672128900 bits/sec
[ 16] MSS size 8948 bytes (MTU 8988 bytes, unknown interface)
[ 19]  0.0-10.2 sec  930742272 Bytes  727474666 bits/sec
[ 19] MSS size 8948 bytes (MTU 8988 bytes, unknown interface)
[SUM]  0.0-10.2 sec  11766333440 Bytes  9196648461 bits/sec

Wed Oct 15 14:06:26 EDT 2014

This shows about 9.2 Gbps, which is great.

The full results for the bandwidth testing, as shown in the perfSONAR GUI, are below:

perfSONAR results - tel-perfsonar-01 to 192-168-10-4 (Chicago) - 2014-10-21 - Final - Full Week

A couple of items to note:

  1. General variation of the bandwidth – inbound and outbound bandwidth over the link to the UChicago perfSONAR node varied widely.
  2. Consistency of ping times – the ping times from tel-perfsonar-01 (192.168.10.3) to the UChicago node (192.168.10.4) were rock solid at 24.6 msec and did not vary between the Arista switch running in normal mode and in OpenFlow mode.
  3. The lack of data between 4 PM and 10 PM on Thursday 10/16/14 was due to Charley incorrectly setting the IP address on the primary network interface (p2p2), which Victor fixed late Thursday evening.

perfSONAR v3.4 Deployment

The default image for perfSONAR v3.4 from Internet2 (http://www.perfsonar.net/; download from http://software.internet2.edu/pS-Performance_Toolkit/) was used to rebuild tel-perfsonar-01.netcom.duke.edu.

This version of perfSONAR (released 10/7/14) supports using multiple NICs on the server and allows you to control which port is used for a given test.  Bandwidth testing can use NIC1 and latency testing can use NIC2, which allows a single server to handle both kinds of tests.  This change will allow us to run perfSONAR in more locations on campus, as we will no longer need to deploy perfSONAR nodes in pairs (one for latency and one for bandwidth).  We have not found that latency tests between nodes on campus are helpful, but latency measurement from off-campus hosts has proven to be useful.

The installation was done from CD using the netinstall method; the “full install” did not work.  Installation was completed on Tuesday 10/14/14 around 1:30.

perfSONAR Configuration

As noted above, the interface used for a test can be selected when a test is created:

Name the test:

perfSONAR configuration - 10-20-14 #1

Then you can choose the interface to use for the test from the drop-down menu:

perfSONAR configuration - 10-20-14 #2

In this case, the interface list on tel-perfsonar-01 shows all of the interfaces on the box – the two built-in 1G interfaces (em1 and em2) along with the 10G interfaces (p2p1 and p2p2).  Note that the list also includes the VLAN 1000 tagged interface on p2p1 that Victor defined for testing with UChicago and AL2S.  But that’s another post (updated 10/21/14).

The results display has also changed with version 3.4; the old system provided a number of different screens for viewing the various test results.

perfSONAR results - tel-perfsonar-01 to 192-168-10-4 (Chicago) - 2014-10-20 - Final

Note the consolidation above into a single graph of the bandwidth (axis on the left) and the ping response times (axis on the right), along with the line below the graph indicating when each test was run.

The graph above is for ping and bandwidth between tel-perfsonar-01 and a perfSONAR node mapped into the AL2S network by UChicago.  The ping times are very consistent but there is a lot of variation in the bandwidth.

The graph below is for bandwidth between tel-perfsonar-01 and tel-perfsonar-02.

perfSONAR results - tel-perfsonar-01 to tel-perfsonar-02 - 2014-10-20 - Final

Note the general consistency of the bandwidth runs between those hosts.  This test was configured to use the interface (p2p2) that is mapped to the production network connection:

perfSONAR configuration - 10-20-14 #3


Planned Updates to perfSONAR Topology

The perfSONAR nodes deployed across campus provide two primary sets of measurements: latency and bandwidth.  In our testing since the beginning of the year, we have gotten greater insight into the campus network from the bandwidth measurements than from the latency measurements.  The latency measurements for wide-area connections to Singapore (both the I2 and Duke services) and along the path to Singapore have been useful.  The latency measurements on campus, however, often show “negative latency” because the precision of the clock synchronization between the two servers is coarser than the very small on-campus one-way delays.

Originally we had deployed pairs of perfSONAR nodes inside of departmental networks. We plan to re-deploy the boxes that are solely used for latency measurements to other places on the Duke network.  We also will reserve one of the Dell servers for ad hoc deployments to campus locations to help debug performance issues.

As noted earlier, we have seen value in latency measurements done from outside the Duke campus and will keep servers for latency measurements in Telcom and Fitz East.  We are also looking to confirm that we can use both NICs in the Dell servers to simultaneously measure bandwidth and latency on the same box.  Note that this does require two 10G network connections in the target data centers/building closets, and those connections should not land on oversubscribed 10G network ports.

Ultimately we will end up with:

  1. Loaner/Test Server
  2. Bandwidth and Latency Measurement – Outside of all networks (edge/perimeter)
  3. Bandwidth and Latency Measurement – Campus Data Center
  4. Bandwidth and Latency Measurement – Inside network, outside of MPLS core
  5. Bandwidth and Latency Measurement – AL2S Network
  6. Bandwidth in Physics DC on Physics VRF
  7. Bandwidth in BioSci DC on Physics VRF
  8. Bandwidth in BioSci DC on Biology VRF
  9. Bandwidth in Library DC on Library VRF
  10. Bandwidth in DSCR on DSCR VRF


perfSONAR Topology

Using the CC-NIE funding, OIT has deployed 10 physical servers as perfSONAR (PS) nodes around the campus.  These servers are generally deployed in pairs: one handles bandwidth measurements and one handles latency measurements.  In order to validate local switch performance, there has been some “mixing” of bandwidth and latency measurements in the infrastructure.

We plan to redo our configuration using the latest PS build, which should allow us to combine bandwidth and latency measurements on the same server using different NICs.

The current servers are:

Server | Location | VRF | Primary Role | Notes
tel-perfsonar-01.netcom.duke.edu | 0200 Telcom Bldg | No VRF (Outside) | Latency | CC-NIE Server
tel-perfsonar-02.netcom.duke.edu | 0200 Telcom Bldg | No VRF (Outside) | Bandwidth | CC-NIE Server
fitz-perfsonar-01.oit.duke.edu | Fitz East DC | DataCenter | General Purpose | OIT Networking
fitz-perfsonar-02.oit.duke.edu | Fitz East DC | DataCenter | Latency | CC-NIE Server
fitz-perfsonar-03.oit.duke.edu | Fitz East DC | DataCenter | Bandwidth | CC-NIE Server
bio-perfsonar-01.biology.duke.edu | BioSci DC | Biology | Latency | CC-NIE Server
bio-perfsonar-02.biology.duke.edu | BioSci DC | Biology | Bandwidth | CC-NIE Server
phy-perfsonar-01.phy.duke.edu | Physics DC | Physics | Latency | CC-NIE Server
phy-perfsonar-02.phy.duke.edu | Physics DC | Physics | Bandwidth | CC-NIE Server
bio-perfsonar-03.phy.duke.edu | BioSci DC | Physics | Latency | CC-NIE Server
bio-perfsonar-04.phy.duke.edu | BioSci DC | Physics | Bandwidth | CC-NIE Server
singapore-perfsonar-01.oit.duke.edu | Singapore DC | Public | Bandwidth | VM
singapore-perfsonar-02.oit.duke.edu | Singapore DC | Private | Bandwidth | VM

The servers listed above give us the ability to test:

Traffic between servers in different buildings on the same VRF
Traffic between servers in the same building on different VRFs
Traffic in and out of the primary data center (Fitz East)
Traffic in and out of the edge of the network (Telcom Servers)
Traffic in and out of servers in the Singapore DC
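
For example, a one-off bandwidth test between two of the hosts above (here, the two bandwidth nodes on the Physics VRF) can be requested from any machine with the bwctl client installed.  This is a sketch and assumes third-party bwctl requests are permitted between these hosts:

bwctl -s phy-perfsonar-02.phy.duke.edu -c bio-perfsonar-04.phy.duke.edu -T iperf -t 20
# -s and -c name the sending and receiving perfSONAR hosts; -T iperf -t 20 runs a 20-second iperf test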

perfSONAR monitoring of campus network health

As noted in an earlier perfSONAR article, we are using perfSONAR to monitor the overall performance of connections to the campus core network and among nodes in different locations on the campus network.  In addition, we are using perfSONAR to identify problems with those connections.

For example, after enabling the 10 Gbps connection of the BioSci building to the new core, we saw some strange behavior, as shown in the graph below:

PS - fitz-perfsonar-03.oit to-from phy-pefsonar-02.phy - 2014-08-21

After the network upgrade was completed, one direction of traffic flow showed the expected improvements, but the reverse direction remained poor.  The performance of the path from phy-perfsonar-02.phy.duke.edu -> fitz-perfsonar-03.oit.duke.edu did not improve when the Fitz East data center was migrated to the new core, nor when the traffic was whitelisted at the IPS.  The problem was fixed in mid-June, when an interface on one of the core switches was found to have issues.  Traffic then greatly improved on the path from Physics to Fitz East, and the traffic paths became symmetrical.  Traffic stayed consistent between the two paths until late July 2014 (7/23/14), when the traffic from Physics to Fitz East again showed degradation compared with the opposite direction.  On 8/12/14, updates to the connection between the core network and the Internet were made, and performance again became consistent between inbound and outbound traffic between Fitz and Physics (detail shown below):

PS - fitz-perfsonar-03.oit to-from phy-pefsonar-02.phy - 2014-08-21 - DETAIL

We are instituting a regular review of a number of perfSONAR graphs by Duke’s Service Operations Center (SOC) in order to catch these issues early.  A simple ping of the path will still succeed even when throughput is degraded, so our normal monitoring services (link monitoring or ping monitoring) may not be sufficient to detect problems that only appear under load.  The iperf tests that are part of perfSONAR appear to be the best way to reliably monitor links for their usable and available bandwidth.
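
When one of the graphs looks suspicious, the path can also be spot-checked manually under load from one of the perfSONAR boxes; a sketch, assuming the hosts accept the bwctl request:

bwctl -c phy-perfsonar-02.phy.duke.edu -T iperf -t 20 -i 1 -P 4
# a 20-second iperf run, reporting every second, with 4 parallel streams,
# which exercises the path in a way a simple ping cannot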

The perfSONAR monitoring of traffic between Fitz East and Physics seems to be a good barometer of the health of connections between the campus core and building networks.  A live view of this can be found at fitz-perfsonar-03 (NetID login required).

perfSONAR Monitoring of OIT Network Upgrades

OIT has deployed a number of perfSONAR (PS) nodes around campus and has found that using PS for bandwidth testing has identified a number of opportunities for bandwidth improvements, as well as clearly showing the ongoing improvements to the Duke core network.

PS - fitz-perfsonar-03.oit to-from bio-pefsonar-02.biology - 2014-08-21

The initial data shows the performance of the network when connected via a 1 Gbps network connection to the original OIT core.  In mid-May, the network was migrated to the new core and connected at a speed of 10 Gbps.  However, at that time the Fitz East data center was not yet connected to the new core.  The reduction of network capacity from 1 Gbps to 500 Mbps was tracked down and found to be related to the throughput of a single stream through the Cisco SourceFire IPS devices used in the new core.  After whitelisting the traffic through the new IPS, the performance between the two servers improved to about 1.5 Gbps, although there was some asymmetry in the traffic rates between the two servers.  On June 9th, the Fitz East data center was moved to the new core network and bandwidth improved to a reliable 2.5 Gbps in each direction.  The only things in the path between the two servers were the SourceFire IPS (which was not inspecting the traffic) and a virtual firewall context, so it was surmised that the 2.5 Gbps limit on bandwidth was due to the firewall.  The bio-perfsonar-02.biology.duke.edu server was taken offline and migrated to a new IP address that was in the Biology data center but not on the Biology VRF.  When the server was restored to service at the new IP address on July 9th, performance immediately improved to an expected 5-7 Gbps.

Similar data is shown below for the network connection between two servers on the same VRF but in different buildings.

Physics has servers in both the Physics building and the BioSci building data center.  The graph below shows traffic flowing between phy-perfsonar-02.phy.duke.edu and bio-perfsonar-04.phy.duke.edu.

PS - bio-perfsonar-04.phy to-from phy-pefsonar-02.phy - 2014-08-21

Both the Physics building and the BioSci building were upgraded to 10 Gbps and connected to the new core on May 15th (shown in more detail below).  The network between the two buildings immediately improved to delivering 6-8 Gbps of bandwidth, up from the earlier limit of 1 Gbps.

PS - bio-perfsonar-04.phy to-from phy-pefsonar-02.phy - 2014-08-21 - DETAIL

It is important to remember that PS measures the available bandwidth; these graphs do not directly show how much bandwidth is actually being used.

Single Rack SDN Environment

Attached is a PDF of a Visio diagram of the single-rack layout.

A couple of things to note about the diagram:

One blade per chassis is on the 10G “control network”: A1 and B1 will be used as controllers and C1 as the monitor/SNMP poller.

The blade chassis are connected via 1G to the control network as well (console/SSH access).

The initial configuration will not connect the data networks to the production network.

Later testing needs to be done to run load over the OpenFlow VRF.

In addition to our current ongoing activity around load testing (both traffic and rules) we need to start work on testing the general use cases.

Use Case #1 – Expressway #1

  • Normal traffic will flow over the A links between switches
  • Traffic from A2 to B2 will be routed over the B link between switch #1 and switch #2

This will require a fairly simple rule, but it could be based on MAC address, source/destination IP, port, etc.

I think we should load up traffic on the A link and then show that we can independently load up traffic on the B path.  We should also plan to put rule set updates/stress on the servers as well.
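
As a rough illustration of how simple such a rule can be, here is what pushing it might look like with curl if the controller were running Ryu’s ofctl_rest application.  This is a sketch only: the DPID, port numbers, and MAC address are hypothetical, and it assumes the ofctl_rest application (not just rest_router) is loaded on the controller.

curl -X POST -d '{
  "dpid": 1,
  "priority": 100,
  "match": {"in_port": 2, "dl_dst": "aa:bb:cc:00:00:02"},
  "actions": [{"type": "OUTPUT", "port": 3}]
}' http://sdn-prod-01.oit.duke.edu:8080/stats/flowentry/add
# match traffic arriving from A2 that is destined for B2's MAC address and
# send it out the port facing the B link instead of the default A link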

Use Case #2 – Expressway #2A

This is similar to the case above, but traffic has to flow through two switches:

  • Traffic from A2 to C2 will flow over the B paths

Use Case #3 – Expressway #2B

This is similar to #1, but the path is between switch #1 and switch #3 and bypasses switch #2:

  • Traffic from A2 to C2 will flow over the C path

Visio-SDN Use Case Mapping – Single Rack – 12-04-13

Results Summary – perfSONAR tests – 10/28/13 – 10/31/13

I assembled a table summarizing the results for the non-overlapping tests, as I am confused about several of them:


Path | 1 Hop (Mbps) | Path | 2 Hops (Mbps)
SDN01->SDN07 | 8999.5 | SDN01->SDN11 | 7771.6
SDN02->SDN05 | 7967.7 | SDN02->SDN09 | 7685.8
SDN03->SDN06 | 7910.8 | SDN03->SDN10 | 8406.4
SDN05->SDN01 | 9056.3 | |
SDN05->SDN09 | 9041.7 | |
SDN06->SDN02 | 9005.1 | |
SDN06->SDN10 | 8978.1 | |
SDN07->SDN03 | 8733.7 | |
SDN07->SDN11 | 8705.8 | |
SDN09->SDN07 | 7870.8 | SDN09->SDN03 | 9073.2
SDN10->SDN05 | 7584.9 | SDN10->SDN01 | 8119.0
SDN11->SDN06 | 8740.1 | SDN11->SDN02 | 8006.4

So, the results that may deserve a deeper look are:

SDN03->SDN06/SDN03->SDN10 and SDN09->SDN07/SDN09->SDN03

which both show the 2-hop results having higher performance than the single-hop results.

The variability of the other results (1-hop consistency vs. 2-hop consistency) may also need to be looked at.  See the next post: the spread of the results, as shown by the standard deviation, does not appear to be big enough to cover the discrepancy; typically there were about 150 samples for each measurement.
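
As a rough sanity check on that claim (assuming roughly 150 independent samples per measurement): the standard error of each mean is approximately the standard deviation divided by sqrt(150) ≈ 12.2.  For SDN03->SDN06 that is about 224.8 / 12.2 ≈ 18 Mbps, and for SDN03->SDN10 about 432.1 / 12.2 ≈ 35 Mbps, so the roughly 500 Mbps gap between those two averages is far larger than the measurement noise; the same holds for the SDN09 pair.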


perfSONAR – Averaged Results

For the perfSONAR bandwidth tests run by Jianan from 10/28/13 to 10/31/13, I have calculated the following averages and standard deviations.  I separated the results at the 5 Gbps (5000 Mbps) level: results above 5000 Mbps are considered non-overlapping and results below 5000 Mbps are considered overlapping.

There are some surprising results: it appears there are some cases where two hops is faster (on average) than one hop, but other results show that two hops is slower.  The spreadsheet with the data is here: SDN perfSONAR Rates #2

Path | Hops | Overlapping Avg | Overlapping Std Dev | Non-Overlapping Avg | Non-Overlapping Std Dev | Notes
SDN01->SDN07 | One | 2739.5 | 176.2 | 8999.5 | 116.4 |
SDN01->SDN11 | Two | 2817.2 | 255.0 | 7771.6 | 274.2 |
SDN02->SDN05 | One | 2734.3 | 165.4 | 7967.7 | 280.1 |
SDN02->SDN09 | Two | 2835.7 | 252.8 | 7685.8 | 315.5 |
SDN03->SDN06 | One | 4147.2 | 188.1 | 7910.8 | 224.8 |
SDN03->SDN10 | Two | 4184.7 | 211.7 | 8406.4 | 432.1 |
SDN05->SDN01 | One | 2937.5 | 182.2 | 9056.3 | 103.9 |
SDN05->SDN09 | One | 3019.3 | 249.6 | 9041.7 | 106.6 |
SDN06->SDN02 | One | 3022.1 | 272.9 | 9005.1 | 121.1 |
SDN06->SDN10 | One | 3170.1 | 388.8 | 8978.1 | 347.2 |
SDN07->SDN03 | One | 3682.3 | 187.1 | 8733.7 | 125.1 |
SDN07->SDN11 | One | 3758.3 | 226.8 | 8705.8 | 114.4 |
SDN09->SDN03 | Two | 2754.4 | 174.6 | 9073.2 | 125.2 | One borderline result >5000 Mbps
SDN09->SDN07 | One | 2873.9 | 307.0 | 7870.8 | 294.1 |
SDN10->SDN01 | Two | 2708.0 | 175.4 | 8119.0 | 339.7 |
SDN10->SDN05 | One | 2843.4 | 280.0 | 7584.9 | 336.5 |
SDN11->SDN02 | Two | 4163.5 | 196.7 | 8006.4 | 234.7 |
SDN11->SDN06 | One | 4160.7 | 232.8 | 8740.1 | 131.6 |

All values are in Mbps.  “Overlapping” denotes results below 5000 Mbps (overlapping tests); “Non-Overlapping” denotes results above 5000 Mbps (non-overlapping tests).
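
For reference, the per-group averages and standard deviations above can be reproduced from raw test results with a short awk script.  This is only a sketch: it assumes a hypothetical file results.txt containing one throughput value in Mbps per line for a single path, and it computes the population (rather than sample) standard deviation:

awk '{ if ($1 >= 5000) { n2++; s2 += $1; q2 += $1*$1 }
       else            { n1++; s1 += $1; q1 += $1*$1 } }
     END {
       if (n1) printf "overlapping (<5000 Mbps):     n=%d avg=%.1f stddev=%.1f\n", n1, s1/n1, sqrt(q1/n1 - (s1/n1)^2)
       if (n2) printf "non-overlapping (>5000 Mbps): n=%d avg=%.1f stddev=%.1f\n", n2, s2/n2, sqrt(q2/n2 - (s2/n2)^2)
     }' results.txt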