
Setting up Systemd-nspawn


1. Overview

In the good old days, when someone wanted to run a service on the internet and still be secure, the way to "sandbox" it was through chroot, which basically left the service alone in a sandbox with its own file system. If someone hacked the service, all they could see was the chrooted environment, thus limiting the consequences of a hack or a malfunction. The problem, on the other hand, was the setup: all the shared libraries had to be copied into the sandboxed root file system, and multiple device files had to be linked into it. Maybe there were scripts and programs to help you out, but I remember sitting for hours trying to get the damn thing to work. Another problem was that there were well-known ways to escape the sandboxed environment.

So someone invented containers, which is basically running a virtual machine inside your host machine. This of course simplified the whole thing: now you have a complete running machine in a local file system. The drawback is that it uses more storage space, but nowadays storage is cheap, so who cares. It is also possible to set up a minimal system, which I guess wouldn't be bigger than a chrooted one. The biggest benefit of a container, though, is process isolation: a container gets its own namespaces from the kernel, which means it can't read kernel memory, can't eat more RAM than it is allowed to, and can't read other processes' memory space. Some container variants (e.g. Docker using aufs) have a virtual file system, which means they start clean every time you restart them. Enough with the overview; I'm pretty sure there are web sites that explain this better than me, and there are people who are far more knowledgeable in this field, so I'll leave it to the reader to get a deeper understanding.

This document describes how to set up a new container using systemd-nspawn.

2. Setting up container filesystem

First of all, to be able to run systemd-nspawn you need to set up a root filesystem somewhere on your host system. If we were to do this by ourselves it would take ages. Fortunately there are scripts and programs that can do this for us, e.g.

pacstrap
install packages to the specified new root directory.
debootstrap
Bootstrap a basic Debian system.
docker
"Stealing" a docker image filesystem.

There are other ways of doing it, but these three are simple enough for this tutorial.

2.1. pacstrap installation

So let's dig into creating a new root system. The first thing we need to do is install the pacstrap script. It is found in the arch-install-scripts package, and if you are running an Arch Linux derivative you can install it with pacman on your host system.

pacman -S arch-install-scripts

First we need to install the base package, which is described as:

Minimal package set to define a basic Arch Linux installation

You can search for packages to be installed on the Arch website. pacstrap needs to be executed with root privileges, so using sudo is necessary.

mkdir first
sudo pacstrap ./first base base-devel vi > /dev/null

OK, that's it: we have created our first root filesystem. Time to set it up and eventually run it; see Setting up.

2.2. From docker to nspawn

Another way of creating a root filesystem is "stealing" it from an existing Docker image. Let's do an example:

docker pull archlinux
Using default tag: latest
latest: Pulling from library/archlinux
Digest: sha256:e543fcbafadece75d0129ac04484b1cb2c36c18847c8609ae7634fe11c688651
Status: Image is up to date for archlinux:latest
docker.io/library/archlinux:latest

We need to set up a container from the image using docker create; this command creates a writable container layer over the specified image.

docker create --name MyArch archlinux /bin/bash
e483582066f3248956f34494b4cbbbf8a3f87456b3441abea545d370dd25368c

Maybe we want to store some info about the container. This small script will just pull out some information and put it in a table.

docker container ls --all | awk '/MyArch/ {split($0,a,"[[:space:]]{2,}"); print a[1],",",a[2],",",a[4],",",a[6]}'
Container Id   Image       Created      Name
e483582066f3   archlinux   2 days ago   MyArch

When this is done we need to export the container. When exporting, the filesystem is written out as a tar archive, so instead of first creating a tar file we pipe it straight to tar and extract the filesystem as it is.

DIR="/tmp/MyArch"
NAME="MyArch"
mkdir ${DIR}
docker export ${NAME} | tar -x -C ${DIR}

That's it, we have our filesystem. To be able to boot the Arch filesystem as a container and log in to it, we need to set the root password and remove the securetty file. This is easily done:

ARCH_NAME="first"
ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}"
rm -f ${ARCH_DIR}/etc/securetty
systemd-nspawn -D ${ARCH_DIR} passwd

Now we are all set up to start our container.

ARCH_NAME="first"
ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}"
sudo systemd-nspawn -bD ${ARCH_DIR}

This should boot up the system, and you should be able to login using the password entered before.

In this case we are using Arch Linux; I would suggest upgrading and installing some packages that will come in handy.

Running inside the container after login:

pacman -Syu
.
.
pacman -S vi iproute2
.
.
.

This should enable us to edit files and check the networking.
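
When you are done inside the container, you can shut it down again from the host. A small sketch, assuming the machine name defaults to the directory name (first, the same name used later with systemd-run --machine=first):

# power off a booted container from the host
sudo machinectl poweroff first

Alternatively, pressing Ctrl+] three times within a second on the container console kills the container.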

3. Setting up

Setting up is pretty simple, though somewhat crucial to be able to use the new root system. There are two steps that need to be taken care of before we do anything:

root password
Change it to something known.
securetty
Add our tty and declare it as secure, or simply remove the file.

3.1. Setting the root password

To be able to set a root password we need to execute the passwd command inside the container. We can do this by executing the command:

systemd-nspawn -D ./first passwd

The passwd binary is just one of many commands that you can execute in the container. A great example is cross-compiling: it is possible to set up a container with all the necessary compilers, linkers and libraries, then just mount your own filesystem into the container and run, for example, cmake with options that make it use the cross-compiler instead.
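
As a rough sketch of that idea (the project path and the toolchain file below are made up for illustration), one could bind-mount a source tree from the host into the container and run the build there:

# bind-mount a hypothetical project into the container and build it there,
# using a (hypothetical) cross-compilation toolchain file
sudo systemd-nspawn -D ./first \
     --bind=/home/calle/myproject:/src \
     bash -c "cd /src && cmake -B build -DCMAKE_TOOLCHAIN_FILE=/opt/toolchain.cmake && cmake --build build"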

3.2. Editing securetty

The file /etc/securetty contains the list of terminals that are considered secure, i.e. where root can log in. Since we are running on a host machine, we can either just add our pseudoterminal or simply remove the file.

echo $PASSWD | sudo -S systemd-nspawn -D ./first  cat /etc/securetty | gawk '/^[a-zA-Z]/ {print $1}'
console
tty1
tty2
tty3
tty4
tty5
tty6
ttyS0
hvc0
pts/0

Either you add your own pty or simply remove the /etc/securetty file.

3.3. Make it boot

Let's see if we can boot the filesystem. This is our first try, so we will run it in host networking mode; more is explained in Networking, but essentially in this mode the container has full access to the host's network, and any packet sent from the container looks as if it originated from the host machine. In other words, there is no distinction between host and container packets departing on the network.

Lets start it up:

sudo systemd-nspawn -bD ./first
.
.
.
[  OK  ] Started Login Service.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.

Manjaro Linux 5.4.58-1-MANJARO  (manjaro) (console)
manjaro login: root
Password:
....
[root@manjaro ~]#

Now we have a complete system set up and running. The password was set before (see password), and the securetty file has been removed (or edited), which means it should be possible to log in as root. Generally speaking you shouldn't log in as root on a system, though at this point we haven't actually connected anything to the outside world yet. But the goal is to have a private container, which basically means that it should look as if it is running on freestanding hardware. That means we will add the securetty file back and add a user.

3.4. Adding user

Adding a user is quite rudimentary, so no further explanation is given here. But before we can add a new user we need to install sudo. For now we just add the whole base-devel group, which gives us some other binaries too; they might come in handy later on, and space is not an issue right now. I will, however, add a new group, so I can allow everyone in that group to use sudo.

ARCH_NAME="first"
ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}"
systemd-nspawn -D ${ARCH_DIR} pacman -S  --noconfirm base-devel

Time to add the new group. This group is set up so that every member of it is allowed to use sudo; we also need to instruct sudo that this group is allowed to do so. This is done by adding a new file to /etc/sudoers.d/ with the syntax described in sudoers.d.

ARCH_NAME="first"
ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}"
systemd-nspawn -D ${ARCH_DIR} groupadd -g 666 supers
systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"%supers ALL=(ALL) NOPASSWD:ALL\" > /etc/sudoers.d/supers"

Finally we add a new user, and we make sure that the user is added to the supers group, which allows it to use sudo.

ARCH_NAME="first"
ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}"
systemd-nspawn -D ${ARCH_DIR} useradd -m -G supers calle
systemd-nspawn -D ${ARCH_DIR} bash -c "echo -e \"secret\nsecret\" | passwd calle"

All done! A new user which is allowed to use sudo is set up. We can now choose to add the securetty file back again, i.e.

ARCH_NAME="first"
ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}"
systemd-nspawn -D ${ARCH_DIR} touch /etc/securetty

This will prevent anyone from logging in as root, and since we have a user which can use sudo, this is not an issue.

3.5. Misc

Here are some useful setup steps for the container.

3.5.1. locale

The locale is used by other programs and libraries to render text correctly; it also sets time and date formats and so on.

Let's dig into it:

ARCH_NAME="first"
ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}"
systemd-nspawn -D ${ARCH_DIR} \
               sed -E -i 's|#(sv_SE ISO-8859-1)|\1|g' /etc/locale.gen

systemd-nspawn -D ${ARCH_DIR} \
               sed -E -i 's|#(en_US.UTF-8 UTF-8)|\1|g' /etc/locale.gen

systemd-nspawn -D ${ARCH_DIR} locale-gen
Generating locales…
en_US.UTF-8… done
sv_SE.ISO-8859-1… done
Generation complete.

That means we have installed the different locales on our system. We can now set the default for all users by editing /etc/locale.conf. We can also set it for a specific user by editing ~<user>/.config/locale.conf, i.e.

ARCH_NAME="first"
ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}"
HOME_DIR=$(systemd-nspawn -D ${ARCH_DIR} lslogins calle | awk '/Home directory:/ {print $3}')
systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"LANG=en_US.utf8\" > /etc/locale.conf"

systemd-nspawn -D ${ARCH_DIR} bash -c "mkdir -p ${HOME_DIR}/.config"
systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"LANG=sv_SE.ISO-8859-1\" > ${HOME_DIR}/.config/locale.conf"
systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"LC_MESSAGES=sv_SE.ISO-8859-1\" >> ${HOME_DIR}/.config/locale.conf"
systemd-nspawn -D ${ARCH_DIR} bash -c "chown -R calle: ${HOME_DIR}/.config"

This can of course be changed after logging in to the container, e.g. with localectl. Anyhow, I will leave it to the user to read more about locale on the Arch wiki page.

3.5.2. hostname

Changing the hostname of the container is a good idea. That means we need to change the file /etc/hostname, but we also need to add the name to the static lookup table /etc/hosts. In my case I will call the container "rovgadda".

ARCH_NAME="first"
ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}"
HOSTNAME="rovgadda"
systemd-nspawn -D ${ARCH_DIR} bash -c "echo ${HOSTNAME} > /etc/hostname"
systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"127.0.0.1   ${HOSTNAME}\" > /etc/hosts"
systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"127.0.0.1   localhost\" >> /etc/hosts"
systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"::1   ${HOSTNAME}\" >> /etc/hosts"
systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"::1   localhost\" >> /etc/hosts"
systemd-nspawn -D ${ARCH_DIR} cat /etc/hosts
127.0.0.1 rovgadda
127.0.0.1 localhost
::1 rovgadda
::1 localhost

3.6. Networking

We start with the host: we need to check what network devices we currently have at our disposal.

DEVS=$(ip add | gawk 'match($0,/^[0-9]+: (.*):/,a) {print a[1]}')

for dev in $DEVS; do
    IP=$(ip add show dev $dev | gawk '/inet / {split($0,b," "); print b[2]}')
    echo "$dev $IP"
done
lo 127.0.0.1/8
enp3s0 192.168.1.9/24

The lo device is the loopback interface, which we can't use for our container. That leaves us with enp3s0, which seems like a good candidate.

Let's dig into the realm of container networking. With systemd-nspawn there are two modes a container's networking can take. Here is a short explanation, with a command sketch for both modes after the list:

Host networking
In this mode the container has full access to the host's network. All packets on the network will look as if they are coming from the host; no separation is done. One can see this as if the container is just another application on the host: they share the same IP, and packets to the container's services are received via the host.
Private networking
Contrary to host networking, a privately networked container behaves more like a separate machine. Any packet leaving the container will have a source different from the actual host, even though the underlying hardware is the same. This is achieved in software, and there are many ways of doing it, at different layers of the TCP/IP model. I will not cover all of them, but merely scratch the surface and get something working that achieves the goal of having a private network.
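
As a rough sketch of how the two modes are selected on the command line (using the directory path from this document; host networking is simply the default when no network option is given):

# Host networking: the default when no network options are given
sudo systemd-nspawn -bD /mnt/data/calle/rootfs/first

# Private networking: for example, give the container a private veth pair
# (or no network at all with --private-network)
sudo systemd-nspawn --network-veth -bD /mnt/data/calle/rootfs/first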

3.6.1. Virtual interfaces

To let the container have its own interface, even though we don't have the hardware, we need to create a virtual network. The veth device is a virtual ethernet device; well, actually it's a pair of virtual devices: whatever is sent on one device immediately shows up on the other. The idea is to make a veth pair and let the container have one side and the host the other.

Why a pair? A device by itself is of no use; it needs to be connected with a cord or some other medium to another device to be able to communicate. One can imagine a veth pair as two network cards connected with a cord between them. This is of course just virtual, and in fact we will be able to connect one of the endpoints to e.g. a bridge, which will be discussed later. But for now it is just two network cards connected with a cord.

+----------+                         +------------+
| veth1    |         virt            |  veth2     |
|  (e1)    +<----------------------->+   (e2)     |
+----------+         cord            +------------+

It is quite easy to construct a veth pair using iproute2. A veth can also be used to connect two different network namespaces. What on earth is that, you might wonder? Ohh, just leave it. I will leave it to the reader to find out more about it, e.g. here, but basically it means that each namespace is separated from the others.

Let's continue without explaining too much about namespaces and stick to the global namespace for now. The following command will create a veth pair:

ip link add e1 type veth peer name e2
ip link show e1
ip link show e2
5: e1@e2: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 5a:ee:84:88:bd:a0 brd ff:ff:ff:ff:ff:ff
4: e2@e1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether ca:a7:0f:d0:31:38 brd ff:ff:ff:ff:ff:ff

We have now created a pair of devices, though the devices are not connected to anything except each other. So let's first start the container, providing it with one of our newly created devices (e2).

ARCH_NAME="first"
ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}"
systemd-nspawn --network-interface=e2 -bD ${ARCH_DIR}

In the container we can now check that we have a new device with ip link:

ip link show e2
4: e2@if3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 82:42:1c:03:2c:2c brd ff:ff:ff:ff:ff:ff link-netnsid 0

We now continue working inside the container and set up an IP address for our new interface:

ip addr add 10.0.0.1/24 dev e2

Back on the host machine, where we have the other end of the connection, we add an IP address there as well:

ip addr add 10.0.0.2/24 dev e1

Finally we need to bring the device up on both sides. On the host:

ip link set up dev e1

and inside the container:

ip link set up dev e2

This is the conceptual view of the host machine.

+--------------------------------------------+
| Host                                       |
|                    +--------------------+  |
|                    |      Container     |  |
|                    |                    |  |
|  e1 ---------------+--- e2              |  |
|  10.0.0.2          |   10.0.0.1         |  |
|                    |                    |  |
|                    +--------------------+  |
|                                            |
+--------------------------------------------+

We can now connect from the host to the container, sweet. This is somewhat of a private connection between the host and the container; it's not connected to anything other than the two of them.
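
A quick sanity check from the host, assuming the addresses above are in place:

ping -c 3 10.0.0.1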

We could create another container and connect the two containers through a veth pair, and so on. But there is no connection to the outside world, and what if we want to connect more devices? Let us continue down the path of bridges.

3.6.2. Bridges

A bridge is a virtual switch which connects networking hardware together. The switch itself has a lookup table connecting MAC addresses to physical ports (though in this case they are virtual-physical?!?), which means it forwards the packets on the data link layer. There are also switches that work on the network layer (layer 3), but let's not delve into more detail than necessary.

Figure: switch with lookup table

So let's first create a "virtual" switch. I'll call it "br0".

BR="br0"
ip link add name ${BR} type bridge
ip link set br0 up
ip link show br0
3: br0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether e2:5d:ad:d2:d4:09 brd ff:ff:ff:ff:ff:ff

The ip link add line above is the actual command that creates the bridge; the other two lines just bring it up and show it.

That's it, we just created a switch, though we haven't really connected anything to it. Since it's not a real physical switch, for obvious reasons we cannot connect cords to it. But we can connect devices, so let's do that.

We start by removing the old veth pair that we used before by deleting one of the interfaces (deleting one end removes the whole pair). I should probably mention that I shut down the container first.

ip link delete e1

Now let's create a new veth pair by issuing:

ip link add q1 type veth peer name q2
ip link set up dev q1
ip link set up dev q2
ip link show q1
ip link show q2
11: q1@q2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 06:7e:60:58:f2:f3 brd ff:ff:ff:ff:ff:ff
10: q2@q1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 22:84:e8:48:b7:ba brd ff:ff:ff:ff:ff:ff

One thing to notice is that the MAC addresses are different from all the interfaces so far. That is because a veth gets its own MAC address, separate from all the others, and that is important since a bridge uses the MAC address to map to a physical port, or in this case a virtual-physical port. We also brought the interfaces up.

Note that even though we set the bridge up earlier, its state will not show UP as long as there are no devices connected to it; that is by design.

Let's now connect one of the veth devices; let's take q1.

ip link set q1 master br0
bridge link
11: q1@q2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 2

Voila! We have our first connection, as the following conceptual figure shows.

+-------------------------------------------------+
|    +------+-----+                        Host   |
|    |      | q1  | br0                           |
|    +------+--+--+                               |
|              |                                  |
|              |                                  |
|              +---------- q2                     |
|                                                 |
|                                                 |
+-------------------------------------------------+

One thing to note is that as soon as we connected q1 to the bridge, the bridge's state changed to UP:

ip link | grep br0:
11: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000

This by itself is not that much fun, so let's create another veth pair; let's call the ends e1 and e2, and connect e1 to the bridge.

ip link add e1 type veth peer name e2
ip link set up dev e1
ip link set up dev e2
ip link set e1 master br0
bridge link
11: q1@q2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 2
13: e1@e2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 2
  • First we create the veth pair.
  • Bring up the interfaces e1 and e2.
  • Add e1 to the bridge.

Now we let one of the q devices (q2, since q1 is connected to the bridge) go into the container as its network interface:

ARCH_NAME="first"
ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}"
systemd-nspawn --network-interface=q2 -bD ${ARCH_DIR}

In the container we set the q2 interface to 10.0.0.1 using the command ip addr add 10.0.0.1/24 dev q2.

On the host side we set the e2 interface to 10.0.0.2 using the command ip addr add 10.0.0.2/24 dev e2.

We get the following picture:

+------------------------------------------------------------+
|                  switch                   container        |
|                 +---+---+            +----------------+    |
|    e2-----------+e1 | q1+-----+      |                |    |
| 10.0.0.2        +---+---+     +------+---- q2         |    |
|                                      |  10.0.0.1      |    |
|                                      +----------------+    |
|                                                            |
|                                                            |
+------------------------------------------------------------+

If we are now inside the container we can ping the e2 device (ping 10.0.0.2), and using the e2 device on the host we can ping the container (ping 10.0.0.1). We could use the e2 device and connect it to another container, and then we would have a connection between the two containers. Why stop there: we could create any number of containers and connect one side of a veth to each container and the other to the bridge, making a star-shaped network. Another nice feature is that we can use e.g. wireshark to sniff the packets between e2 and q2, using br0 as the capture interface. And why not add another bridge, connect the two bridges together and maybe even add VLAN tags (see vlan description); there are endless possibilities.
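
As a quick illustration of that last point, one can watch the ping traffic crossing the bridge from the host; a sketch using tcpdump instead of wireshark (assuming tcpdump is installed):

# capture ICMP packets passing through the bridge while pinging from the container
sudo tcpdump -i br0 icmp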

3.6.2.1. Routing

Let's first check what routes we have inside the container.

systemd-run --machine=first -t  /usr/bin/ip route

The result might not be so surprising:

10.0.0.0/24 dev q2 proto kernel scope link src 10.0.0.1 

What we learn here is that anything destined for the 10.0.0.0/24 network goes straight out via the q2 device, using the source address 10.0.0.1 that we assigned to it.

To be consistent, let's also check the route from the host side:

ip route get 10.0.0.1 oif e2

The above command shows the route used to reach the address 10.0.0.1 when using the output device e2. This gives the not-so-surprising result:

10.0.0.1 dev e2 src 10.0.0.2 uid 0
    cache

One problem that might occur when setting the IP address of an interface is forgetting the netmask, e.g.

ip addr add 10.0.0.2 dev e2
ip route
default via 192.168.1.1 dev enp3s0 proto static
192.168.1.0/24 dev enp3s0 proto kernel scope link src 192.168.1.9

The problem here is that the command will by default add /32 as the netmask. The consequence of that is that no route is added for the 10.0.0.0/24 network, which therefore becomes unreachable. So be sure to add the /24, or whatever netmask you are using, otherwise you might end up spending hours trying to figure out why it's not working, i.e. ip addr add 10.0.0.2/24 dev e2.
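
For completeness, a small sketch of the corrected command and the link route the kernel then adds (only the relevant line is shown; the full routing table contains more entries):

ip addr add 10.0.0.2/24 dev e2
ip route
10.0.0.0/24 dev e2 proto kernel scope link src 10.0.0.2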

3.6.3. Outside connection using physical interface

So we have our basic understanding of bridges and of setting up virtual ethernet devices. Though there is one thing that has been bothering us for a while: our container is just sitting there without a connection to the outside world. In this case we will connect a real physical device to our bridge, as the following overview picture shows.

+-------------------------------------------------+
|                                                 |
|   br0                 +----------------+        |
|  192.168.1.9          |                |        |
| +-------+------+      |                |        |
| |enp3s0 |   q1 +------+-- q2  DHCP     |        |
| +---+---+------+      |   192.168.1.?  |        |
|     |                 +----------------+        |
|     |                                           |
|     |                                           |
+-----|-------------------------------------------+
      |
      |
      +

Let's start from scratch and set up the veth pair:

echo "----------Q-devices----------"
ip link add q1 type veth peer name q2
ip link set up dev q1
ip link set up dev q2
ip link show q1
ip link show q2
ip link set q1 master br0
echo "----------Bridge connections----------"
bridge link
----------Q-devices----------
29: q1@q2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 06:7e:60:58:f2:f3 brd ff:ff:ff:ff:ff:ff
28: q2@q1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 22:84:e8:48:b7:ba brd ff:ff:ff:ff:ff:ff
----------Bridge connections----------
29: q1@q2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 2

Let's start the container with q2 as its interface:

ARCH_NAME="first"
ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}"
systemd-nspawn --network-interface=q2 -bD ${ARCH_DIR}

It's now time to connect our enp3s0 device, which is a physical device. If we just connect the interface enp3s0 to the bridge br0 with the command ip link set enp3s0 master br0, we lose the connection altogether; not even the local network is available. Why? In this case I'm not sure; I will update this if I figure it out. Instead, we first remove the IP address from the interface:

ip addr del 192.168.1.9/24 dev enp3s0

Then we attach the interface to the bridge:

ip link set enp3s0 master br0
ip addr add 192.168.1.9/24 dev br0
ip route add default via 192.168.1.1
  • Setting the master to br0 attaches the physical interface to the bridge.
  • Give the bridge the IP address that was previously assigned to the physical interface.
  • Finally, set the default route (not necessary if the route already exists).

We have one more thing to do, and that is to set the IP address of the container. We now have the choice of setting the IP address on the interface as usual, or we could even use DHCP, i.e.

systemd-run --machine=first -t  /usr/bin/dhclient q2
systemd-run --machine=first -t  /usr/bin/ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
28: q2@if29: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 22:84:e8:48:b7:ba brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.1.190/24 brd 192.168.1.255 scope global q2
       valid_lft forever preferred_lft forever
    inet6 fe80::2084:e8ff:fe48:b7ba/64 scope link 
       valid_lft forever preferred_lft forever

The container is now able to reach both the bridge on the host (and anything connected to it) and the surrounding network, or even the Internet. That also means that the outside network can connect to the container, e.g. if it is running a web server. Adding another container is just a few steps away.
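
As a quick way to try that out, here is a sketch assuming python has been installed in the container (it is not part of the packages installed earlier):

# inside the container: serve the current directory over HTTP on port 8000
python -m http.server 8000

# from another machine on the LAN: fetch it using the address the container
# received from DHCP (192.168.1.190 in the output above)
curl http://192.168.1.190:8000/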

3.6.4. Other methods

You might ask yourself if there is an easier way of doing this. Of course there is; I just wanted to cover the background knowledge first.

Now that we have a bridge br0 we can just supply it as an argument when starting our container, i.e.

systemd-nspawn --network-bridge=br0 -bD /mnt/data/calle/rootfs/first

This command does exactly the same as what we did in the Bridges section: it creates a veth pair, gives the container one side and connects the other to the bridge. Except it is much easier.

Another way of achieving connectivity is to use a macvlan interface (see macvlan). This is much easier than the process of setting up a bridge, and it also gives the container full access to the network. In bridge mode the macvlan behaves much like what is explained above. In the following example I will use the bridge as the underlying network interface (just to show that it works).

systemd-nspawn --network-macvlan=br0 -bD /mnt/data/calle/rootfs/first

But usually this is done with a real physical interface. A macvlan interface is a virtual interface that adds a second MAC address to an existing physical ethernet link.
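
For example, the same thing using the physical interface from earlier in this document (a sketch; substitute your own interface name):

# systemd-nspawn creates a macvlan on top of enp3s0 and moves it into the
# container, where it shows up as mv-enp3s0
sudo systemd-nspawn --network-macvlan=enp3s0 -bD /mnt/data/calle/rootfs/first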

You might wonder: what if I want to add an additional veth to my container?

Let's consider this picture:

+---------------------------------------------------+
|       +---------------------------------+         |
| +-----+-------+                +--------+-----+   |
| |     |       |     Bridge     |        |     |   |
| |     |       | +-----+-----+  |        |     |   |
| |     e1   q1-+-+ q0  |  p0 +--+-p1    e2     |   |
| |             | |     |     |  |              |   |
| |             | +-----+-----+  |              |   |
| | container 2 |                | container 1  |   |
| +-------------+                +--------------+   |
|                                                   |
+----------------------host-------------------------+

Each of the containers has a connection to the bridge, but there is also a private connection (a veth pair) between the containers. The following commands would make that happen:

systemd-nspawn --network-bridge=br0 --network-veth-extra=e1:e2 -bD first
systemd-nspawn --network-bridge=br0 --network-interface=e2 -bD second
Container 1
Adds a connection between the bridge and the container via the host0 interface (called p1/p0 in the picture); it also creates an extra veth pair (e1/e2), where one end is moved into the container.
Container 2
Makes the same connection to the bridge (q1/q0), but instead of creating a new veth pair it is handed the other end of the extra pair through --network-interface.

4. Links

Author: Calle Olsen

Created: 2022-06-11 Sat 19:33