Setting up Systemd-nspawn
1. Overview
In the good old days, when someone wanted to run a service on the internet and still be secure, the way to "sandbox" a service was through chroot, which basically left the service in a sandbox of its own with its own file system. If someone hacked the service, all they could see was the chrooted environment, thus limiting the consequences of a hack or a malfunction. The problem, on the other hand, was the setup: all the shared libraries needed to be copied to the sandboxed root system, and there were multiple device files that needed to be linked into the sandboxed file system! Maybe there were scripts and programs to help you out, but I remember sitting for hours trying to get the damn thing to work. Another problem was that there were well-known ways to escape the sandboxed environment.
So someone invented containers, which is basically like running a virtual machine inside your host machine. This of course simplified the whole thing. Now you had a complete running machine in a local file system. The drawback was of course that it used way more storage space, but nowadays storage is cheap, so who cares. It's also possible to set up a minimal system, which I guess wouldn't be bigger than a chrooted system. But the biggest benefit of a container is the process isolation: a container gets its own namespaces from the kernel, and the consequence is that the container can't read kernel memory, can't eat more RAM than allowed, and of course can't read other processes' memory space. Some container variants (e.g. Docker using aufs) have a virtual file system, which means they start clean every time you restart them.
Enough with the overview. I'm pretty sure there are web sites that explain this better than me, and there are people that are way more knowledgeable than me in this field, so I'll leave it to the reader to get a better understanding.
This document will just describe how to set up a new container using systemd-nspawn.
2. Setting up container filesystem
First of all, to be able to run systemd-nspawn you need to set up a root filesystem in some part of your host system. If we were to do this by ourselves it would take ages. Fortunately there are scripts and programs that can do this for us, e.g.
- pacstrap
- install packages to the specified new root directory.
- debootstrap
- Bootstrap a basic Debian system.
- docker
- "Stealing" a docker image filesystem.
There are other ways of doing it, but these three are simple enough for this tutorial.
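debootstrap is not used further in this tutorial, but for completeness a minimal sketch could look like this (assuming debootstrap is installed on the host; the suite, target directory and mirror are only examples):

# Bootstrap a basic Debian "stable" system into ./debian-root (example paths).
sudo debootstrap stable ./debian-root http://deb.debian.org/debian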
2.1. pacstrap installation
So let's dig into creating a new root system.
The first thing we need to do is to install the pacstrap script.
It is found in the arch-install-scripts package, and if you are running
an Arch Linux derivative you can use pacman
on your host system.
pacman -S arch-install-scripts
First we need to install the base package ("Minimal package set to define a basic Arch Linux installation").
You can search for packages to install on the Arch website.
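You can also search from the command line; a hedged sketch:

# Search the package databases for the base package (regular expression match).
pacman -Ss '^base$'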
pacstrap
needs to be executed with root privileges, so using
sudo is necessary.
mkdir first
sudo pacstrap ./first base base-devel vi > /dev/null
OK, that's it. We have created our first root filesystem. Time to set it up, configure it and eventually run it (see Setting up).
2.2. From docker to nspawn
Another way of creating a root filesystem is "stealing" it from an existing docker container. Let's do an example:
docker pull archlinux
Using default tag: latest
latest: Pulling from library/archlinux
Digest: sha256:e543fcbafadece75d0129ac04484b1cb2c36c18847c8609ae7634fe11c688651
Status: Image is up to date for archlinux:latest
docker.io/library/archlinux:latest
We need to set it up using docker create; this command creates a writable container layer over the specified image.
docker create --name MyArch archlinux /bin/bash
e483582066f3248956f34494b4cbbbf8a3f87456b3441abea545d370dd25368c
So maybe we want to store some info about the docker container. This small script will just pull out some information and put it in a table.
docker container ls --all | awk '/MyArch/ {split($0,a,"  +"); print a[1],",",a[2],",",a[4],",",a[6]}'
Container Id | Image | Created | Name |
---|---|---|---|
e483582066f3 | archlinux | 2 days ago | MyArch |
When this is done we need to export the container.
When exporting, the filesystem is written as a tar archive, so instead of
first creating a tar file we pipe it straight to tar and extract
the filesystem as it is.
DIR="/tmp/MyArch" NAME="MyArch" mkdir ${DIR} docker export ${NAME} | tar -x -C ${DIR}
That's it, we have our filesystem. To be able to boot the
Arch filesystem as a container and log in to it, we need to set the root password
and remove the securetty file. This is easily done:
ARCH_NAME="first" ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}" rm -f ${ARCH_DIR}/etc/securetty systemd-nspawn -D ${ARCH_DIR} passwd
Now we are all setup to start our container.
ARCH_NAME="first" ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}" sudo systemd-nspawn -bD ${ARCH_DIR}
This should boot up the system, and you should be able to log in using the password entered before.
In this case we are using Arch Linux; I would suggest upgrading and installing packages that will come in handy.
Running inside the container after login:
pacman -Syu
.
.
pacman -S vi iproute2
.
.
.
This should enable us to edit files and check the networking.
3. Setting up
The setup is pretty simple, though somewhat crucial to be able to use the new root system. There are two things that need to be fixed before we do anything.
- root password
- change to something..
- securetty
- need to add our tty and declare it as secure.
3.1. Setting the root password
To be able to set a root password we need to execute the passwd command inside
the container. We can do this by executing the command
systemd-nspawn -D ./first passwd
The passwd binary is just one of many commands that you
can execute in the container. A great example is if you want to
cross-compile something: it is then possible to set up a container
with all the necessary compilers, linkers and libraries, and then just mount
your own filesystem into the container and run, for example, cmake with
options that make it use the cross-compiler instead.
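As a rough sketch of that idea (the project path, mount point and toolchain file below are hypothetical, not part of this setup, and the container is assumed to have cmake and a cross toolchain installed):

# Bind-mount a host project into the container and build it with a cross toolchain.
sudo systemd-nspawn -D ./first --bind=/home/calle/project:/work \
     bash -c "cd /work && cmake -DCMAKE_TOOLCHAIN_FILE=arm-toolchain.cmake . && make"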
3.2. Editing securetty
The file /etc/securetty contains the list of terminals which
are considered secure, i.e. where root can log in. Since we are running on a host machine, we can either
just add our pseudoterminal or simply remove the file.
echo $PASSWD | sudo -S systemd-nspawn -D ./first cat /etc/securetty | gawk '/^[a-zA-Z]/ {print $1}'
console |
tty1 |
tty2 |
tty3 |
tty4 |
tty5 |
tty6 |
ttyS0 |
hvc0 |
pts/0 |
Either you add your own pty or simply remove the /etc/securetty
file.
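Both options as a small sketch (pts/0 matches the pseudoterminal shown in the listing above):

# Option 1: declare the pseudoterminal as secure.
echo "pts/0" | sudo tee -a ./first/etc/securetty
# Option 2: simply remove the file.
sudo rm ./first/etc/securetty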
3.3. Make it boot
Let's see if we can boot the filesystem. This is our first try, so we will run it in host networking mode; more is explained in Networking, but essentially in this mode the container has full access to the host's network, and any packets sent from the container look as if they originate from the host machine. In other words there is no distinction between host and container packets departing on the network.
Let's start it up:
sudo systemd-nspawn -bD ./first
.
.
.
[ OK ] Started Login Service.
[ OK ] Reached target Multi-User System.
[ OK ] Reached target Graphical Interface.

Manjaro Linux 5.4.58-1-MANJARO (manjaro) (console)

manjaro login: root
Password:
....
[root@manjaro ~]#
Now we have a complete system set up and running. The password was
set before (see Setting the root password), and the securetty
file has been edited, which means it should be possible to log in as root. Generally
speaking you shouldn't log in as root on a system, though at this point
we haven't actually connected anything to the outside world yet. But the
goal is to have a private container, which basically means that it
will look as if it is connected as freestanding hardware. That means
that we will add the securetty file back again and add a user (see Adding user).
3.4. Adding user
Adding a user is quite rudimentary, so no further explanation is given here. But before we can add a new user we need to install sudo. For now we just install the whole base-devel group, which will give us some other binaries too, but they might come in handy later on, and space is not an issue right now. I will however add a new group, so that I can allow everyone in that group to use sudo.
ARCH_NAME="first" ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}" systemd-nspawn -D ${ARCH_DIR} pacman -S --noconfirm base-devel
Time to add the new group; it is set up so that all members of
that group are allowed to use sudo. We also need to
instruct sudo that this group is allowed to do so. This is done by
adding a new file to /etc/sudoers.d/
with the syntax described in sudoers.d.
ARCH_NAME="first" ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}" systemd-nspawn -D ${ARCH_DIR} groupadd -g 666 supers systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"%supers ALL=(ALL) NOPASSWD:ALL\" > /etc/sudoers.d/supers"
Finally we add a new user; we also make sure that the user is added to the supers group, which allows it to use sudo.
ARCH_NAME="first" ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}" systemd-nspawn -D ${ARCH_DIR} useradd -m -G supers calle systemd-nspawn -D ${ARCH_DIR} bash -c "echo -e \"secret\nsecret\" | passwd calle"
All done! A new user which is allowed to use sudo is set up.
We can now choose to add the securetty file again, i.e.
ARCH_NAME="first" ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}" systemd-nspawn -D ${ARCH_DIR} touch /etc/securetty
This will prevent anyone from logging in as root (an empty securetty means no terminal is considered secure), and since we have a user which can use sudo, this is not an issue.
3.5. Misc
Here are some useful settings for the container.
3.5.1. locale
The locale is used by other programs and libraries to render text correctly; it also sets the time and date formats, and so on.
Let's dig into it.
ARCH_NAME="first" ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}" systemd-nspawn -D ${ARCH_DIR} \ sed -E -i 's|#(sv_SE ISO-8859-1+)|\1|g' /etc/locale.gen systemd-nspawn -D ${ARCH_DIR} \ sed -E -i 's|#(en_US.UTF-8 UTF-8)|\1|g' /etc/locale.gen systemd-nspawn -D ${ARCH_DIR} locale-gen
Generating locales…
  en_US.UTF-8… done
  sv_SE.ISO-8859-1… done
Generation complete.
That means we have generated the different locales on our system.
We can now set the default for all users by editing /etc/locale.conf.
We can also set it for a specific user; this is done by editing
~<user>/.config/locale.conf, i.e.
ARCH_NAME="first" ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}" HOME_DIR=$(systemd-nspawn -D ${ARCH_DIR} lslogins calle | awk '/Home directory:/ {print $3}') systemd-nspawn -D ${ARCH_DIR} bash -c "echo "LANG=en_US.utf8" > /etc/locale.conf" systemd-nspawn -D ${ARCH_DIR} bash -c "echo "LANG=sv_SE.ISO-8859-1" > /home/calle/.config/locale.conf" systemd-nspawn -D ${ARCH_DIR} bash -c "echo "LC_MESSAGES=sv_SE.ISO-8859-1" >> /home/calle/.config/locale.conf"
This can of course be changed after logging in to the container, e.g. with localectl. Anyhow, I will leave it to the reader to read more about locale on the Arch wiki page.
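A short sketch of what that looks like inside the running container:

# Show the current locale settings and change the system-wide default.
localectl status
localectl set-locale LANG=en_US.UTF-8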
3.5.2. hostname
Changing the hostname of the container is a good idea. That means we need to change
the file /etc/hostname, but we also need to add it to the static lookup table
/etc/hosts. In my case I will call the container "rovgadda".
ARCH_NAME="first" ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}" HOSTNAME="rovgadda" systemd-nspawn -D ${ARCH_DIR} bash -c "echo ${HOSTNAME} > /etc/hostname" systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"127.0.0.1 ${HOSTNAME}\" > /etc/hosts" systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"127.0.0.1 localhost\" >> /etc/hosts" systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"::1 ${HOSTNAME}\" >> /etc/hosts" systemd-nspawn -D ${ARCH_DIR} bash -c "echo \"::1 localhost\" >> /etc/hosts" systemd-nspawn -D ${ARCH_DIR} cat /etc/hosts
127.0.0.1 | rovgadda |
127.0.0.1 | localhost |
::1 | rovgadda |
::1 | localhost |
3.6. Networking
We start with the host: we need to check what network devices we currently have at our disposal.
DEVS=$(ip add | gawk 'match($0,/^[0-9]+: ([^:]+):/,a) {print a[1]}')
for dev in $DEVS; do
    IP=$(ip add show dev $dev | gawk '/inet / {split($0,b," "); print b[2]}')
    echo "$dev $IP"
done
lo | 127.0.0.1/8 |
enp3s0 | 192.168.1.9/24 |
The lo
is the loopback interface, which we can't use for our container.
That leaves us with enp3s0
which seems like a good candidate.
Let's dig into the realm of container networking. For systemd-nspawn containers there are two networking modes a container can use. Here is a short explanation:
- Host networking
- In this mode the container has full access to the host's network. This means that all the packets on the network will look as if they are coming from the host; no separation is done. One can see this as if the container itself were just an application on the host: they share the same IP, and packets to services in the container are received on the host's address.
- Private networking
- Contrary to host networking, private networking is more like a separate machine. Any packet leaving the container will have a source different from the actual host machine, even if the underlying hardware is the same. This is achieved through software, and there are many different ways of achieving it, on different layers of the TCP/IP model. I will not cover all the different ways, but merely scratch the surface and get something that works and achieves the goal of having a private network.
3.6.1. Virtual interfaces
To let the container have its own interface, even though we don't have the hardware, we need to create a virtual network. The veth device is a virtual ethernet device; well, actually it's a pair of virtual devices: whatever is sent on one device immediately ends up on the other. The idea is to make a veth pair and let the container have one side and the host the other.
Why a pair? A device by itself is of no use; it needs to be connected with a cord or some other medium to another device to be able to communicate. One can imagine a veth pair as two network cards connected with a cord between them. This is of course just virtual, and in fact we will be able to connect one of the endpoints to e.g. a bridge, which will be discussed later. But for now it's just two network cards connected with a cord.
+----------+                          +------------+
|  veth1   |           virt           |   veth2    |
|  (e1)    +<------------------------>+   (e2)     |
+----------+           cord           +------------+
It's quite easy to construct a veth pair using iproute2. A veth can also be used to connect two different network namespaces. What on earth is that, you might wonder? I will leave it to the reader to find out more about it, e.g. here, but basically it means that each namespace is separated from the others.
Let's continue without explaining too much about namespaces, and stick to the global namespace for now. The following commands will create a veth pair:
ip link add e1 type veth peer name e2
ip link show e1
ip link show e2
5: e1@e2: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 5a:ee:84:88:bd:a0 brd ff:ff:ff:ff:ff:ff
4: e2@e1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether ca:a7:0f:d0:31:38 brd ff:ff:ff:ff:ff:ff
We have now created a pair of devices, though they are not connected to anything except each other. So let's start the container, providing it with one of our newly created devices (e2).
ARCH_NAME="first" ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}" systemd-nspawn --network-interface=e2 -bD ${ARCH_DIR}
In the container we can now check that we have a new device with ip link:
$ ip link show e2
4: e2@if3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 82:42:1c:03:2c:2c brd ff:ff:ff:ff:ff:ff link-netnsid 0
We now continue working inside the container and set up an IP address for our new interface:
ip addr add 10.0.0.1/24 dev e2
Back on the host machine, where we have the other end of the connection, we add an IP address there too:
ip addr add 10.0.0.2/24 dev e1
Finally we need to bring the devices up on both sides. On the host (run the corresponding ip link set up dev e2 inside the container):
ip link set up dev e1
This is the conceptual view of the host machine.
+--------------------------------------------+
| Host                                       |
|                 +--------------------+     |
|                 | Container          |     |
|                 |                    |     |
|   e1 -----------+--- e2              |     |
|   10.0.0.2      |     10.0.0.1       |     |
|                 |                    |     |
|                 +--------------------+     |
|                                            |
+--------------------------------------------+
We can now connect from the host to the container, sweet! This is somewhat of a private connection between the host and the container; it's not connected to anything other than the two endpoints themselves.
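A quick sanity check from the host (a sketch, assuming the addresses set above):

# The container end of the veth pair (e2, 10.0.0.1) should answer.
ping -c 3 10.0.0.1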
We could create another container and connect the two containers through a veth pair, and so on. But there is no connection to the outside world, and what if we want to connect more devices? Let us continue down the path with bridges.
3.6.2. Bridges
A bridge is a virtual switch which connects networking hardware together. The switch itself has a lookup table connecting a MAC address with a physical port (though in this case it's a virtual "physical" port), which means it forwards packets at the data link layer. There are also switches that work on the network layer (layer 3), but let's not delve into more details than necessary.
Switch with lookup-table
So let's first create a "virtual" switch. I'll call it "br0".
BR="br0" ip link add name ${BR} type bridge ip link set br0 up ip link show br0
3: br0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether e2:5d:ad:d2:d4:09 brd ff:ff:ff:ff:ff:ff
The ip link add name ${BR} type bridge line is the actual command that creates the bridge; the other lines just bring it up and show it.
That's it, we just created a switch, though we haven't really connected anything to it. Since it's not a real physical switch, for obvious reasons we cannot connect cords to it. But we can connect devices. Let's do that.
We start by removing the old veth interfaces that we used before, by deleting one of them (deleting one end removes the pair). I should probably mention that I shut down the container first.
ip link delete e1
Now let's create a new veth pair by issuing:
ip link add q1 type veth peer name q2
ip link set up dev q1
ip link set up dev q2
ip link show q1
ip link show q2
11: q1@q2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 06:7e:60:58:f2:f3 brd ff:ff:ff:ff:ff:ff
10: q2@q1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 22:84:e8:48:b7:ba brd ff:ff:ff:ff:ff:ff
One thing to notice is that the MAC addresses are different from all the interfaces so far; that is because a veth actually gets its own MAC address, separate from all the others. That is important, since a bridge uses the MAC address to map to a physical port, or in this case a virtual "physical" port. We also brought the interfaces up.
You may notice that the bridge itself is not operationally up yet; the reason is that as long as no devices are connected to it, it stays down, and that is by design.
Let's now connect one of the veth devices; let's take q1.
ip link set q1 master br0
bridge link
11: q1@q2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 2
Voila! We have our first connection, as the following conceptual figure shows.
+-------------------------------------------------+
|    +------+-----+                          Host |
|    |      | q1  | br0                           |
|    +------+--+--+                               |
|              |                                  |
|              |                                  |
|              +---------- q2                     |
|                                                 |
+-------------------------------------------------+
One thing to note is that as soon as we connected q1
to the
bridge, the state changed to UP
ip link | grep br0:
11: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
This by itself is not that much fun, so let's create another veth pair, let's call them e1 and e2, and connect e1 to the bridge.
ip link add e1 type veth peer name e2
ip link set up dev e1
ip link set up dev e2
ip link set e1 master br0
bridge link
11: q1@q2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 2
13: e1@e2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 2
We now let one of the q devices (q2, since q1 is connected to the bridge) go into the container as a device:
ARCH_NAME="first" ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}" systemd-nspawn --network-interface=q2 -bD ${ARCH_DIR}
In the container we set the q2 interface to 10.0.0.1
using the ip addr add 10.0.0.1/24 dev q2 command,
and on the host side we set the e2 interface to 10.0.0.2
using the ip addr add 10.0.0.2/24 dev e2 command.
We then get the following picture.
+--------------------------------------------------------------+
|                    switch                      container      |
|                  +---+---+                +----------------+  |
|   e2-------------+e1 | q1+----------------+---- q2         |  |
|   10.0.0.2       +---+---+                |    10.0.0.1    |  |
|                                           +----------------+  |
|                                                               |
+--------------------------------------------------------------+
If we now go inside the container we can ping the e2 device
(ping 10.0.0.2), and using the e2 device on the host we can ping the
container (ping 10.0.0.1). We could take the e2 device and
connect it to another container, and then we would have a connection
between the two containers. Why stop there? We could create any
number of containers, connecting one side of a veth to each
container and the other to the bridge, making a star-shaped
network. Another nice feature is that we could use
e.g. wireshark to sniff the packets between e2 and q2, using br0
as the capture interface. And why not add another bridge, connect
the two bridges together, and maybe even add VLAN tags (see vlan
description); there are endless possibilities.
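A small sketch of the sniffing idea, using tcpdump here instead of wireshark (same capture interface):

# Capture the ICMP traffic crossing the bridge while pinging between host and container.
sudo tcpdump -i br0 icmp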
3.6.2.1. Routing
Let's first check what routing we have inside the container.
systemd-run --machine=first -t /usr/bin/ip route
The result might not be so surprising.
10.0.0.0/24 dev q2 proto kernel scope link src 10.0.0.1
What we learn here is that anything destined for the 10.0.0.0/24 network is sent directly out the q2 device, using the source address 10.0.0.1 that we assigned to it.
To be consistent, let's check the route from the host side:
ip route get 10.0.0.1 oif e2
The above command shows the route used to reach the address 10.0.0.1 through the output device e2, which gives the not-so-surprising result:
10.0.0.1 dev e2 src 10.0.0.2 uid 0 cache
One problem that might occur when setting the IP address of an interface is forgetting the netmask, i.e.
ip addr add 10.0.0.2 dev e2
ip route
default via 192.168.1.1 dev enp3s0 proto static
192.168.1.0/24 dev enp3s0 proto kernel scope link src 192.168.1.9
The problem here is that the command will by default add a /32
netmask, and the consequence of that is that no route will be
added for the 10.0.0.0/24 network, which is therefore not
reachable. So be sure to add the /24, or whatever netmask you are
using, otherwise you might end up spending hours trying to
figure out why it's not working, i.e. ip addr add 10.0.0.2/24 dev e2.
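A quick way to see the difference (a sketch, run on the host):

# Remove the /32 from the failed attempt (if present), add the address with its
# mask, and confirm that the kernel installed the on-link route automatically;
# expect something like: 10.0.0.0/24 proto kernel scope link src 10.0.0.2
ip addr del 10.0.0.2/32 dev e2
ip addr add 10.0.0.2/24 dev e2
ip route show dev e2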
3.6.3. Outside connection using physical interface
So we have a basic understanding of the bridge and of setting up virtual ethernet devices. Though there is one thing that has been bothering us for a while: our container is just sitting there without a connection to the outside world. In this case we will connect a real physical device to our bridge.
As the following overview picture shows.
+-------------------------------------------------+
|                                                 |
|   br0                    +----------------+     |
|   192.168.1.9            |                |     |
|  +-------+------+        |                |     |
|  |enp3s0 |  q1  +--------+-- q2  DHCP     |     |
|  +---+---+------+        |   192.168.1.?  |     |
|      |                   +----------------+     |
|      |                                          |
|      |                                          |
+------|------------------------------------------+
       |
       |
       +
Let's start from scratch and set up the veth pair:
echo "----------Q-devices----------" ip link add q1 type veth peer name q2 ip link set up dev q1 ip link set up dev q2 ip link show q1 ip link show q2 ip link set q1 master br0 echo "----------Bridge connections----------" bridge link
----------Q-devices----------
29: q1@q2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 06:7e:60:58:f2:f3 brd ff:ff:ff:ff:ff:ff
28: q2@q1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 22:84:e8:48:b7:ba brd ff:ff:ff:ff:ff:ff
----------Bridge connections----------
29: q1@q2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 2
Let's start the container with q2 as its interface.
ARCH_NAME="first" ARCH_DIR="/mnt/data/calle/rootfs/${ARCH_NAME}" systemd-nspawn --network-interface=q2 -bD ${ARCH_DIR}
It's now time to connect our enp3s0 device, which is a physical
device. If we just connect the interface enp3s0 to the bridge
br0 with the command ip link set enp3s0 master br0, we lose
the connection altogether; not even the local network is available.
Why? Once an interface is enslaved to a bridge, its traffic is
handled by the bridge at layer 2, so an IP address configured on the
enslaved port no longer works; the address has to live on the bridge
instead. So we first remove the IP address from the interface:
ip addr del 192.168.1.9/24 dev enp3s0
Then we attach the interface to the bridge by doing:
ip link set enp3s0 master br0
ip addr add 192.168.1.9/24 dev br0
ip route add default via 192.168.1.1
- Adding enp3s0 as a slave of br0 ("master") attaches the physical interface to the bridge.
- Setting the bridge IP to the same address that was previously assigned to the physical interface.
- Finally, setting the default route (not necessary if the route already exists); a quick check is sketched below.
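A sketch of that check on the host:

# The address should now sit on br0 and the default route should leave through it.
ip addr show dev br0
ip route show default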
We have one more thing to do, and that is to set the IP address of the container. We now have the choice of setting the IP address on the interface as usual, or we could even use DHCP, i.e.
systemd-run --machine=first -t /usr/bin/dhclient q2
systemd-run --machine=first -t /usr/bin/ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
28: q2@if29: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 22:84:e8:48:b7:ba brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.1.190/24 brd 192.168.1.255 scope global q2
       valid_lft forever preferred_lft forever
    inet6 fe80::2084:e8ff:fe48:b7ba/64 scope link
       valid_lft forever preferred_lft forever
The container is now able to connect both to the bridge on the host and its connected entities, and to the surrounding network or even the Internet. That also means that the outside network can connect to the container, e.g. if it is running a webserver. Adding another container is just a few steps away.
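A sketch of what a second container could look like (the veth names r1/r2 and the root directory "second" are hypothetical):

# Create another veth pair, attach one end to the bridge and give the other to the new container.
ip link add r1 type veth peer name r2
ip link set up dev r1
ip link set up dev r2
ip link set r1 master br0
systemd-nspawn --network-interface=r2 -bD /mnt/data/calle/rootfs/second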
3.6.4. Other methods
You might ask yourself if there is an easier way of doing this. And of course there is; I just wanted to go through the background knowledge first.
Now that we have a bridge br0
we could just supply that as an argument
when starting our container. i.e.
systemd-nspawn --network-bridge=br0 -bD /mnt/data/calle/rootfs/first
This command does exactly what we did in the Bridges section: it creates a veth pair, provides the container with one side and connects the other to the bridge. Except it is much easier.
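On the host you can verify the link it created; the host side of the automatically created pair is named with a vb- prefix followed by the machine name:

# br0 should now list the automatically created veth, e.g. vb-first.
bridge link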
Another way of achieving connectivity is to use a macvlan interface (see macvlan). This is much easier than the process of setting up a bridge, and it also gives the container full access to the network. In bridge mode a macvlan behaves much like what is explained above. In the following example I will use the bridge as the underlying network interface (just to show that it works).
systemd-nspawn --network-macvlan=br0 -bD /mnt/data/calle/rootfs/first
But usually this is done with a real physical interface. A macvlan interface is a virtual interface that adds a second MAC address to an existing physical ethernet link.
You might wonder: what if I want to add an additional veth to my container?
Let's consider this picture.
+-----------------------------------------------------+
|                                                     |
|  +-------------+                  +--------------+  |
|  |             |      Bridge      |              |  |
|  |  e1---------+------------------+---------e2   |  |
|  |             |   +----+----+    |              |  |
|  |  q1---------+---+ q0 | p0 +----+--------p1    |  |
|  |             |   +----+----+    |              |  |
|  | container 2 |                  | container 1  |  |
|  +-------------+                  +--------------+  |
|                                                     |
+------------------------host-------------------------+
Here each of the containers has a connection to the bridge, but there is also a private connection (a veth pair) between the containers. The following commands would work:
systemd-nspawn --network-bridge=br0 --network-veth-extra=e1:e2 -bD first
systemd-nspawn --network-bridge=br0 --network-interface=e1 -bD second
- Container 1
- Adds a connection between the bridge and the container via the host0 interface (called p1/p0 in the picture); it also creates an extra veth pair (e1/e2), where the container side (e2) ends up inside container 1 and the host side (e1) stays on the host.
- Container 2
- Gets the same kind of connection to the bridge (q1/q0 in the picture), but instead of creating a new veth pair it takes over the host-side e1 through
network-interface