Reading Notes: Understanding Linux Network Internals, Parts 1-2

This blog has covered relatively new networking technologies, but it’s good to revisit the fundamentals. The Linux kernel will be used for a long time to come, and even without perfect understanding, having a grasp of the basics is meaningful. This book is over 1,000 pages and consists of parts 1-7, so maintaining motivation to read it all at once is difficult. In this article, I’ll summarize what I’ve read so far....

August 26, 2021

Building a Busybox-based Minimal Linux Environment and Booting with QEMU

Many people have already done similar work and written it up nicely in blog posts 1 2 3, but I still think it is valuable to be able to spin up such an environment quickly. Here I’ll record the rough mechanism. I’ve collected the results into scripts and uploaded them to GitHub. Features: supports building kernels for major distributions such as CentOS 6, CentOS 7, and Ubuntu 20.04, so it is useful in practice; uses Busybox to deploy the userland in memory, giving a clean, minimal environment on every boot; allows SSH login and external network connectivity, making it easy to verify integration with other systems; allows inspecting internal kernel data by debugging with GDB; currently supports only x86/64. Building the Kernel: Building the kernel basically corresponds to writing the build configuration in a ....

July 7, 2021

Cloud Native Data Center Networking Reading Notes

I skimmed through only the parts that interested me. It’s well organized and a book I’d like to keep on hand. The book seems to emphasize the importance of KISS (Keep it simple, stupid) throughout. Physical Aspects: Don’t connect multiple links between a Leaf and a Spine. From a routing perspective, when one link fails, the same amount of traffic as before the failure flows through the remaining live link. Because the failed link’s capacity is gone, the expected bandwidth cannot be secured and performance degrades....

May 28, 2021

EVPN in the Data Center Reading Notes

I read EVPN in the Data Center, so I’m leaving some notes here. You can download the PDF for free from the NVIDIA page by registering your email address and other information. These are just personal notes during investigation, so they may contain errors. Introduction How should we deploy applications that assume L2 on an L3 network configured with a Clos topology? For example, applications that use L2 multicast or broadcast for health monitoring and member discovery fall into this category....

May 12, 2021

Running FRR (BGP Unnumbered) on Mininet

Introducing https://github.com/bobuhiro11/mininetlab. Using Mininet, you can run several switches and hosts on a single machine. I’ll use this to virtually launch two hosts and connect them via BGP (Unnumbered) included in the FRR package. In Mininet, you can describe the topology and command execution on each host in Python as follows. Although FRR may seem complex at first glance, if you properly place the three files daemons, vtysh.conf, and frr.conf, you can easily start it with frrinit....

May 8, 2021

Running BGP in Data Centers at Scale

I read Facebook’s paper 1 from NSDI 2021 and tried to summarize it in my own words without fear of mistakes. Overview This is a summary of Facebook’s BGP-based data center network. It covers practical content such as AS number allocation, route aggregation, BGP policies, testing and deployment methods for custom BGP implementations. The latter part also touches on actual incidents experienced over the past two years of operation. 1 Introduction Facebook operates several data centers, and these data centers have a common AS number allocation schema....

April 21, 2021

Use of BGP for Routing in Large-Scale Data Centers

I read RFC 7938, so I’m taking notes. Since I only touched on it briefly, my terminology may be inappropriate. Overview: The RFC summarizes a simple and highly stable network design for large-scale infrastructure with over 100,000 servers. BGP is adopted as the only routing protocol. Introduction: The goal is to operate the infrastructure of large-scale distributed systems, such as web search engines, on a simple and highly stable network with a small number of people....

April 13, 2021

Running OpenStack DevStack in a Container

Introduction to https://github.com/bobuhiro11/containerized-devstack. OpenStack has a convenient tool 1 for creating an all-in-one development environment with a single command, but it assumes VMs or physical machines as the target environment. If we could create that environment inside a container, it would make re-setup easier, which would be great. Of course, similar attempts 2 3 4 have been made several times in the past, but they are no longer maintained at this point, so I decided to create my own....

March 31, 2021

Accelerating qcow2 by Introducing Subclusters

qcow2 Structure: Before discussing subclusters, let me briefly explain the data structure of qcow2. qcow2 data is composed of small blocks called clusters. When a guest reads from or writes to a virtual disk, I/O on the backing qcow2 file is performed in cluster units. For example, in an environment where the cluster size is 64KB (QEMU’s default), if the guest reads or writes in 4KB units (a common block size), the I/O to the qcow2 file will be in 64KB units....
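To make the 4KB versus 64KB arithmetic concrete, here is a minimal Python sketch of the simplified model described above, in which every guest request is rounded out to whole clusters. The function name `backend_io_bytes` and its parameters are hypothetical and are not part of QEMU; real qcow2 behavior depends on allocation state and other details that this ignores.

```python
def backend_io_bytes(offset, length, cluster_size=64 * 1024):
    """Bytes touched in the qcow2 file if I/O is rounded out to whole clusters.

    Simplified model for illustration only (an assumption, not QEMU code).
    """
    first_cluster = offset // cluster_size
    last_cluster = (offset + length - 1) // cluster_size
    return (last_cluster - first_cluster + 1) * cluster_size


# A 4KB guest request inside one cluster touches a full 64KB cluster,
# i.e. 16 times the requested amount.
touched = backend_io_bytes(offset=0, length=4 * 1024)
amplification = touched // (4 * 1024)
```

Under this model, a 4KB request that straddles a cluster boundary touches two clusters, so the amplification factor doubles.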

March 18, 2021

Building a VMM with KVM to Boot Linux - Development Log 2

2021/2/24 WSL2 Support 4f6b785: When running gokvm on Ubuntu 20.04 on WSL2 (Windows Subsystem for Linux 2), output to I/O port 0x64 repeated indefinitely and the boot never reached init process startup. The behavior around the PS/2 keyboard seems to have been the cause. kvmtool returns 0x20 for an in on port 0x61 1, so I followed that approach. I/O port 0x61 appears to be used as the NMI (Non-Maskable Interrupt) status and control register 2....

March 3, 2021

Building a VMM with KVM to Boot Linux - Development Log

Introduction: I created a naive, experimental VMM using KVM. It creates virtual machines by issuing ioctl calls on /dev/kvm, and it can boot the Linux kernel and user processes on them. I also implemented a very simple serial console emulation that the kernel’s device driver recognizes, so the VM can be operated from a login shell. Networking and disks are not yet supported. Recently, KVM has been used not only for traditional virtual machines but also to strengthen isolation in multi-tenant cloud environments, in container and micro-VM projects such as Google gVisor 1, Kata Containers 2, and Amazon Firecracker 3....

February 18, 2021

Linux Kernel Implementation of SRv6

What is SRv6? SRv6 is an extension of IPv6 that implements Source Routing. Source Routing means that the data sender specifies not only the destination but also the route. Nodes to pass through are identified by SIDs (Segment Identifiers), and the route can be freely controlled by including the list of SIDs in the packet header. In SRv6, an IPv6 address corresponds to a SID. The specification of SRv6 is being developed mainly by the IETF (Internet Engineering Task Force) 1....
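As a rough illustration of the idea above, the following Python sketch models a packet that carries a list of SIDs and is steered through them one by one. The `Packet` class and the function names are hypothetical. Storing the list in reverse order with a Segments Left index follows my understanding of the Segment Routing Header in RFC 8754. This is a toy model, not the Linux kernel implementation.

```python
from dataclasses import dataclass
from ipaddress import IPv6Address
from typing import List


@dataclass
class Packet:
    dst: IPv6Address                 # current IPv6 destination address
    segment_list: List[IPv6Address]  # SIDs in reverse order (index 0 = last hop)
    segments_left: int               # index of the currently active SID


def make_packet(path):
    """Sender side: encode the ordered path (a list of SID strings) into the header."""
    sids = [IPv6Address(s) for s in path]
    return Packet(dst=sids[0],
                  segment_list=list(reversed(sids)),
                  segments_left=len(sids) - 1)


def process_at_endpoint(pkt):
    """Node owning pkt.dst: activate the next SID. Returns False at the final segment."""
    if pkt.segments_left == 0:
        return False
    pkt.segments_left -= 1
    pkt.dst = pkt.segment_list[pkt.segments_left]
    return True


pkt = make_packet(["fc00::1", "fc00::2", "fc00::3"])
```

Each call to `process_at_endpoint` rewrites the destination address to the next SID, which is how the sender's chosen route is enforced hop by hop in this model.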

January 17, 2021

mTLS (Mutual TLS) Notes

What is mTLS? It’s called mutual TLS or TLS mutual authentication. I read a well-organized article 1; I’m not confident I can express it accurately, but I’ll take notes in my own words. TLS is a protocol for encrypting communication over a network. It’s used for web browsing, email, Voice over IP, etc. In web browsing in particular, it’s familiar as the lock icon displayed on the left side of the address bar....
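To show what "mutual" adds in practice, here is a small sketch using Python's standard `ssl` module: a server-side context that also requires and verifies a client certificate. The helper name `make_mtls_server_context` and the `cafile` parameter are hypothetical; loading the server's own certificate and key is omitted, so this is only the part that differs from ordinary TLS.

```python
import ssl


def make_mtls_server_context(cafile=None):
    """Server-side TLS context that demands a client certificate (mutual TLS)."""
    # Purpose.CLIENT_AUTH creates a context for the server side of a connection.
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # Ordinary TLS servers do not ask for a client certificate;
    # CERT_REQUIRED makes the handshake fail unless the client presents a valid one.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cafile is not None:
        # CA bundle used to validate client certificates (path is an assumption).
        ctx.load_verify_locations(cafile=cafile)
    return ctx


ctx = make_mtls_server_context()
```

With ordinary TLS only the server is authenticated. With this setting the server authenticates the client as well.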

January 12, 2021

RoCE v2 Notes

I looked into how RoCE v2 1, a mechanism that realizes RDMA over Ethernet, is operated at Microsoft. Introduction: When it comes to RDMA, I used to think of InfiniBand, but recently iWARP and RoCE are also candidates. RoCE stands for Remote Direct Memory Access over Converged Ethernet. Remote Direct Memory Access is a mechanism that can read and write the main memory of a remote node without involving that node’s CPU....

November 25, 2020

QEMU/KVM on WSL2 Log

I was able to run virtual machines on a WSL2 guest on Windows 10 via /dev/kvm in a nested configuration. Environment: Windows 10 Pro Insider Program (Dev Channel, OS Build 20246.1) 1; guest on WSL2: Ubuntu 20.04.1 LTS (Focal Fossa); Linux 4.19.128-microsoft-standard; kernel parameters: initrd=\initrd.img panic=-1 nr_cpus=4 swiotlb=force pty.legacy_count=0; QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.7); Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz. Procedure: By adding the following settings to the WSL2 (Windows Subsystem for Linux 2) global configuration C:\Users\username\....

November 4, 2020

vDPA (Virtio Data Path Acceleration) Notes

vDPA is a method to achieve high-performance (NIC wire-speed) and flexible I/O in virtual machine and container environments. Not much Japanese information is available yet. I haven’t actually tried it, so there may be misunderstandings. vDPA Kernel Framework: In March 2020, the vDPA kernel framework was merged into Linux 5.7. A vDPA device handled by the vDPA kernel framework is a device whose data plane follows the virtio specification and whose control plane is vendor-specific....

October 27, 2020

Practical Rust Programming - Reading Notes

Chapter 1: Characteristics of Rust Rust was developed primarily by Mozilla, the developer of Firefox. It emphasizes safety, such as not allowing pointers to invalid memory regions. It also does not have a complex runtime like garbage collection. In Stack Overflow surveys, it ranked first for three consecutive years from 2016 to 2018 as the “most loved language.” LLVM is adopted as the compiler backend. On several environments including x86 Linux, static linking with external libraries is also supported....

October 19, 2020

VMware Complete Introduction - Reading Notes

Miscellaneous Notes: vSphere environment interfaces: GUI: vSphere Client, vSphere Web Client; CLI: vSphere CLI, VMware vSphere PowerCLI; DCUI (Direct Console User Interface): ESXi console. Virtual disks are created as a single file called .vmdk and stored in the datastore. For VLAN VST configuration, Port Group = VLAN configuration is common and recommended. vSphere DRS automatically performs vMotion for the purpose of load balancing between ESXi hosts. vMotion is so-called live migration....

October 15, 2020

XDP Notes (Architecture, Performance, Use Cases)

Introduction I read “The eXpress data path: fast programmable packet processing in the operating system kernel” 1. This article is mostly based on this paper, with some references to news articles. The popularity of eBPF/XDP can be felt from the GitHub star counts of projects using eBPF/XDP, such as BCC, bpftrace, Facebook Katran, and Cloudflare Gatebot. eBPF/XDP has a powerful advantage: it can achieve high-speed packet processing as a kernel mechanism without depending on special hardware or software....

September 17, 2020

Educational Codeforces 95 B - Negative Prefixes

Submission Code. When swapping two elements $a_l$ and $a_r$ (with $l < r$), if $a_r > a_l$, the prefix sum increases over a certain range and remains unchanged elsewhere: there is no change in the range $0 \sim l-1$, an increase of $+(a_r-a_l)$ in the range $l \sim r-1$, and no change in the range $r \sim n-1$. In other words, you simply need to sort the unlocked elements in descending order.
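The conclusion above can be sketched in a few lines of Python. This is not the linked submission. The function name `rearrange` and the input representation (a list of values plus a parallel list of lock flags, where a truthy flag means the position is locked) are assumptions made for illustration.

```python
def rearrange(a, locked):
    """Keep locked positions fixed and fill unlocked ones in descending order."""
    # Collect the values at unlocked positions, largest first.
    free = sorted((x for x, is_locked in zip(a, locked) if not is_locked),
                  reverse=True)
    it = iter(free)
    # Locked values stay in place; every other slot takes the next largest value.
    return [x if is_locked else next(it) for x, is_locked in zip(a, locked)]


result = rearrange([-8, 4, -2, -6, 4, 7, 1], [1, 0, 0, 0, 1, 1, 0])
```

Placing larger unlocked values earlier never decreases any prefix sum, which is the exchange argument stated above.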

September 16, 2020