Learning eBPF Reading Notes

I read Learning eBPF, so I’d like to leave some reading notes. This book was released last year and I’ve been wanting to read it for a while. The repository lizrice/learning-ebpf on github.com has abundant sample code for reference. I’ll just note things that caught my attention personally, without paying much attention to context. BCC: The book starts with an example using BCC. Using bpf_trace_printk() allows you to output text to the pseudo-file /sys/kernel/debug/tracing/trace_pipe....

December 5, 2024

Using SRv6 with OVS

OVS (Open vSwitch) 3.2 added support for SRv6 1, so let’s try using it. To put it simply, you create a port with type=srv6 as shown below. Since it’s implemented using the same framework as existing tunneling protocols like VXLAN and Geneve, you specify both tunnel endpoints (corresponding to SIDs on both sides in SRv6) with options:remote_ip and options:local_ip. In addition to these, SRv6 has a special option options:srv6_segs to set intermediate routers as a Segment List....

March 29, 2024
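
As a rough illustration of the SRv6 port described in the entry above, here is a minimal sketch that only builds the ovs-vsctl argument list. The option names type=srv6, options:remote_ip, options:local_ip, and options:srv6_segs come from the excerpt. The bridge name, port name, and addresses are hypothetical, and the comma-separated format for options:srv6_segs is an assumption that should be checked against the OVS documentation for your version.

```python
def srv6_port_command(bridge, port, local_ip, remote_ip, segs=None):
    """Build the ovs-vsctl argv that adds an SRv6 tunnel port to a bridge."""
    argv = [
        "ovs-vsctl", "add-port", bridge, port, "--",
        "set", "interface", port, "type=srv6",
        f"options:remote_ip={remote_ip}",
        f"options:local_ip={local_ip}",
    ]
    if segs:
        # Assumption: the Segment List is passed as one comma-separated value.
        argv.append("options:srv6_segs=" + ",".join(segs))
    return argv


if __name__ == "__main__":
    # Hypothetical bridge, port, and SIDs, for illustration only.
    cmd = srv6_port_command(
        bridge="br0",
        port="srv6_0",
        local_ip="fc00:100::100",
        remote_ip="fc00:300::1",
        segs=["fc00:200::1", "fc00:300::1"],
    )
    print(" ".join(cmd))
```

Building the list first and printing it keeps the sketch side-effect free. Passing the same list to subprocess.run() would apply it on a host where OVS 3.2 or later is installed.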

Running SRv6 L3VPN with Mininet

I conducted an experiment to run SRv6 L3VPN in Mininet 1. The script is here. I was able to run it with a small configuration as shown in the diagram, so I’d like to introduce it here. Two routers (r1 and r2) are responsible for Encap/Decap with SRv6, and they exchange L3VPN information between r1 and r2 via eBGP. r1 and r2 each have two VRFs (vrf10 and vrf20), and tenants are separated by VRF (Tenant10 and Tenant20)....

July 7, 2023

Created a TRex Wrapper

First, let me introduce TRex. TRex 1 is a software-implemented traffic generator that supports two modes: Stateful/Stateless. Stateless is a mode for generating packet sequences to a stateless target DUT (Device Under Test), and can be used as a performance measurement tool for switching and routing. Although TRex has many features, personally I often need to generate simple TCP/IP packet sequences while changing their size, so I created autotrex 2 as a wrapper for TRex....

June 9, 2023

Background and Usage of Open vSwitch AF_XDP

Recently, I heard that OVS (Open vSwitch) has added support for AF_XDP, so I investigated what background led to this and how to use it. OVS is composed of kernel modules and userspace processes. With this architecture, the following issues became apparent 1, and recently the implementation using AF_XDP is being promoted as a replacement: modifications that require kernel updates or system-wide restarts; being influenced by kernel developer policies and implementations; performance that lags behind DPDK; too many backports; and sometimes losing distribution support. All of these seem to be valid reasons to push forward with architectural changes....

December 7, 2021

Reading Notes: Understanding Linux Network Internals, Parts 1-2

This blog has covered relatively new networking technologies, but it’s good to revisit the fundamentals. The Linux kernel will be used for a long time to come, and even without perfect understanding, having a grasp of the basics is meaningful. This book is over 1,000 pages and consists of parts 1-7, so maintaining motivation to read it all at once is difficult. In this article, I’ll summarize what I’ve read so far....

August 26, 2021

Cloud Native Data Center Networking Reading Notes

I skimmed through only the parts that interested me. It’s well organized and a book I’d like to keep on hand. Throughout the book, the importance of KISS (Keep it simple, stupid) seems to be emphasized. Physical Aspects: Don’t connect multiple links between a Leaf and a Spine. From a routing perspective, when a link fails, the same amount of traffic as before the failure will flow through the other live link. Since a certain link has failed, the expected bandwidth cannot be secured and performance degrades....

May 28, 2021

EVPN in the Data Center Reading Notes

I read EVPN in the Data Center, so I’m leaving some notes here. You can download the PDF for free from the NVIDIA page by registering your email address and other information. These are just personal notes during investigation, so they may contain errors. Introduction: How should we deploy applications that assume L2 on an L3 network configured with a Clos topology? For example, applications that use L2 multicast or broadcast for health monitoring and member discovery fall into this category....

May 12, 2021

Running FRR (BGP Unnumbered) on Mininet

Introducing https://github.com/bobuhiro11/mininetlab. Using Mininet, you can run several switches and hosts on a single machine. I’ll use this to virtually launch two hosts and connect them via BGP (Unnumbered) included in the FRR package. In Mininet, you can describe the topology and command execution on each host in Python as follows. Although FRR may seem complex at first glance, if you properly place the three files daemons, vtysh.conf, and frr.conf, you can easily start it with frrinit....

May 8, 2021

Running BGP in Data Centers at Scale

I read Facebook’s paper 1 from NSDI 2021 and tried to summarize it in my own words without fear of mistakes. Overview: This is a summary of Facebook’s BGP-based data center network. It covers practical content such as AS number allocation, route aggregation, BGP policies, and testing and deployment methods for custom BGP implementations. The latter part also touches on actual incidents experienced over the past two years of operation. 1 Introduction: Facebook operates several data centers, and these data centers have a common AS number allocation schema....

April 21, 2021

Use of BGP for Routing in Large-Scale Data Centers

I read RFC 7938, so I’m taking notes. Since I only touched on it briefly, my terminology may be inappropriate. Overview: The RFC summarizes simple and highly stable network design methods for large-scale infrastructure with over 100,000 servers. BGP is adopted as the only routing protocol. Introduction: The goal is to operate the infrastructure of large-scale distributed systems like web search engines on a simple and highly stable network with a small number of people....

April 13, 2021

Linux Kernel Implementation of SRv6

What is SRv6? SRv6 is an extension of IPv6 that implements Source Routing. Source Routing means that the data sender specifies not only the destination but also the route. Nodes to pass through are identified by SIDs (Segment Identifiers), and the route can be freely controlled by including the list in the packet header. In SRv6, an IPv6 address corresponds to a SID. The specification of SRv6 is being developed mainly by IETF (Internet Engineering Task Force) 1....

January 17, 2021
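
To make the idea of carrying a SID list in the packet header concrete, here is a minimal sketch that models the list in plain Python rather than using the kernel API. It assumes the Segment Routing Header layout from RFC 8754, where the list is stored in reverse order and a Segments Left counter points at the active SID. The class name, function names, and SIDs are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass
class SRv6Packet:
    dst: ipaddress.IPv6Address                  # current IPv6 destination address
    segment_list: List[ipaddress.IPv6Address]   # stored in reverse path order
    segments_left: int                          # index of the active SID


def encapsulate(path: List[str]) -> SRv6Packet:
    """Build a packet whose route follows `path` (first SID visited first)."""
    sids = [ipaddress.IPv6Address(s) for s in path]
    segment_list = list(reversed(sids))
    segments_left = len(sids) - 1
    return SRv6Packet(segment_list[segments_left], segment_list, segments_left)


def process_at_endpoint(pkt: SRv6Packet) -> SRv6Packet:
    """Simplified endpoint step: advance to the next SID if one remains."""
    if pkt.segments_left > 0:
        pkt.segments_left -= 1
        pkt.dst = pkt.segment_list[pkt.segments_left]
    return pkt


if __name__ == "__main__":
    # Hypothetical SIDs, for illustration only.
    pkt = encapsulate(["fc00:1::1", "fc00:2::1", "fc00:3::1"])
    print(pkt.dst)
    while pkt.segments_left > 0:
        process_at_endpoint(pkt)
        print(pkt.dst)
```

The sketch ignores everything else an endpoint does, such as validation, the Hop Limit, and the various endpoint behaviors. It only shows how the destination address steps through the list chosen by the sender.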

mTLS (Mutual TLS) Notes

What is mTLS? It’s also called mutual TLS or TLS mutual authentication. I read a well-organized article 1. I’m not confident I can express it accurately, but I’ll take notes in my own words. TLS is an encryption protocol used when communicating over a network. It’s used for web browsing, email, Voice over IP, etc. Especially in web browsing, it’s familiar as a lock icon displayed on the left side of the address bar....

January 12, 2021

RoCE v2 Notes

I looked into how RoCE v2 1, a mechanism that realizes RDMA over Ethernet, is operated within Microsoft. Introduction: When it comes to RDMA, I used to think of InfiniBand, but recently iWARP and RoCE are also candidates. RoCE stands for Remote Direct Memory Access over Converged Ethernet. Remote Direct Memory Access is a mechanism that can read and write the main memory of a remote node without going through the CPU....

November 25, 2020

vDPA (Virtio Data Path Acceleration) Notes

A method to achieve high-performance (NIC wirespeed) and flexible I/O in virtual machine and container environments. Not much Japanese information is available yet. I haven’t actually tried it, so there may be misunderstandings. vDPA Kernel Framework: In March 2020, the vDPA kernel framework was merged into Linux 5.7. A vDPA device handled by the vDPA kernel framework refers to a device where the data plane follows the virtio specification and the control plane is vendor-specific....

October 27, 2020

VMware Complete Introduction - Reading Notes

Miscellaneous Notes: The vSphere environment offers these interfaces: GUI (vSphere Client, vSphere Web Client), CLI (vSphere CLI, VMware vSphere PowerCLI), and DCUI (Direct Console User Interface, the ESXi console). Virtual disks are created as a single file called .vmdk and stored in the datastore. For VLAN VST configuration, Port Group = VLAN configuration is common and recommended. vSphere DRS automatically performs vMotion for the purpose of load balancing between ESXi hosts. vMotion is so-called live migration....

October 15, 2020

XDP Notes (Architecture, Performance, Use Cases)

Introduction: I read “The eXpress data path: fast programmable packet processing in the operating system kernel” 1. This article is mostly based on this paper, with some references to news articles. The popularity of eBPF/XDP can be felt from the GitHub star counts of projects using eBPF/XDP, such as BCC, bpftrace, Facebook Katran, and Cloudflare Gatebot. eBPF/XDP has a powerful advantage: it can achieve high-speed packet processing as a kernel mechanism without depending on special hardware or software....

September 17, 2020

Monitoring ER-X with Prometheus

I purchased Ubiquiti Networks’ router EdgeRouter X (ER-X) a few years ago but left it unused, so this time I tried using it as an L2 switch. Since ER-X runs on a Debian-based OS, monitoring programs like Node exporter for Prometheus can be used. Prometheus itself was run on a Raspberry Pi. I wanted to run ElastiFlow too, but it seemed too demanding for the Raspberry Pi’s specs. Installing Prometheus and Grafana: Install Prometheus and Grafana on the Raspberry Pi....

August 10, 2020

Using TP-Link Archer T3U AC1300 on Linux

I managed to get the USB wireless LAN adapter TP-Link Archer T3U AC1300 working on a Thinkpad X1 Carbon 2015 (Fedora release 29 (Twenty Nine), linux 5.3.6-100.fc29.x86_64), so I’m documenting the procedure.
    # The official website didn't distribute drivers, so I used rtl88x2bu
    git clone https://github.com/cilynx/rtl88x2bu
    cd rtl88x2bu/
    git rev-parse HEAD
    # 962cd6b1660d3dae996f0bde545f641372c28e12
    VER=$(sed -n 's/\PACKAGE_VERSION="\(.*\)"/\1/p' dkms.conf)
    sudo rsync -rvhP ./ /usr/src/rtl88x2bu-${VER}
    # Move to /usr/src/<module_name>_<module_version> and manage with dkms
    sudo dkms add -m rtl88x2bu -v ${VER}
    sudo dkms build -m rtl88x2bu -v ${VER}
    sudo dkms install -m rtl88x2bu -v ${VER}
    sudo dkms status
    sudo modprobe 88x2bu
When connecting two types of adapters (onboard adapter and TP-Link adapter) to the same network, priority is determined by metric....

November 9, 2019

TCP Technology Introduction - Reading Notes

I read TCP Technology Introduction, so I’ll take some notes on the interesting parts, although they’re rather miscellaneous. First, the first half. I’ve heard about network-related topics, but I often forget them if I don’t use them. Chapter 1: Introduction to TCP. Transmission efficiency: with an Ethernet frame of 1500 bytes, a TCP header of 60 bytes, and an IP header of 20 bytes, applications can use 1420 bytes. Considering the Ethernet header of 14 bytes and the FCS (Frame Check Sequence) of 4 bytes, transmission efficiency is $1420/(1500+18) \times 100=93....

September 25, 2019
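
The transmission efficiency arithmetic in the TCP entry above can be reproduced with a few lines. This is a minimal sketch that uses only the byte counts quoted in the excerpt. The function and parameter names are hypothetical.

```python
def transmission_efficiency(frame=1500, tcp_header=60, ip_header=20,
                            eth_header=14, fcs=4):
    """Return (application bytes, efficiency in percent) for one frame."""
    # Bytes left for the application after the TCP and IP headers.
    payload = frame - tcp_header - ip_header
    # Bytes counted on the wire: the frame plus Ethernet header and FCS.
    on_wire = frame + eth_header + fcs
    return payload, payload / on_wire * 100


if __name__ == "__main__":
    payload, efficiency = transmission_efficiency()
    print(f"{payload} bytes usable, {efficiency:.2f}% efficiency")
```

Changing tcp_header to a smaller value shows how much of the overhead comes from using the largest TCP header size.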