Initializing
Overview
The `fdctl configure` command is used to set up the host operating system so Firedancer can run correctly. It does the following:
- `hugetlbfs` Reserves huge and gigantic pages for use by Firedancer.
- `sysctl` Sets required kernel parameters.
- `ethtool-channels` Configures the number of channels on the network device.
- `ethtool-gro` Disables generic-receive-offload (GRO) on the network device.
- `ethtool-loopback` Disables tx-udp-segmentation on the loopback device.
The `hugetlbfs` configuration must be performed every time the system is rebooted, to remount the `hugetlbfs` filesystems, but the `sysctl`, `ethtool-channels` and `ethtool-gro` configuration only needs to be performed on the machine once.
The configure command is run like `fdctl configure <mode> <stage>...` where `mode` is one of:
- `init` Configures the provided stages if they are not already configured.
- `check` Checks if each stage is already configured. The command will exit with an error code if they are not. `check` never requires privileges and will not make any changes to the system.
- `fini` Unconfigures (reverses) the stage if it is reversible.
`stage` can be one or more of `hugetlbfs`, `sysctl`, `ethtool-channels`, `ethtool-gro` and `ethtool-loopback`. These stages are described below. You can also use the stage `all`, which will configure everything.
Stages have different privilege requirements, which you can see by trying to run the stage without privileges. The `check` mode never requires privileges, and the `init` mode will only require privileges if it needs to actually change something.
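As a rough illustration of how the `check` and `init` modes fit together, the shell sketch below only escalates to `sudo` when `check` reports that a stage is not configured. The `ensure_configured` function and the `FDCTL` variable are hypothetical names introduced for this example; only the `fdctl configure check` and `fdctl configure init` invocations come from the description above.

```shell
#!/bin/sh
# Hypothetical override so the sketch can point at a specific fdctl binary.
FDCTL="${FDCTL:-fdctl}"

# ensure_configured [stage]
# Runs the unprivileged `check` mode first, and only invokes `init` with
# sudo when the check reports that the stage is not configured.
ensure_configured() {
    stage="${1:-all}"
    if "$FDCTL" configure check "$stage"; then
        echo "$stage: already configured"
    else
        echo "$stage: not configured, running init"
        sudo "$FDCTL" configure init "$stage"
    fi
}
```

Calling `ensure_configured sysctl` would handle a single stage, and calling it with no argument falls back to the `all` stage.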
hugetlbfs
The `hugetlbfs` stage is used to reserve `huge` (2MiB) and `gigantic` (1GiB) memory pages from the Linux kernel for use by Firedancer. See also the kernel documentation of these pages. Almost all memory in Firedancer is allocated out of these pages for performance reasons.
This is a two-step process. First, the number of `huge` and `gigantic` pages available on the entire system is increased in the kernel by increasing `/sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages` until the `free_hugepages` value is high enough for all the memory needs of the validator.
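To see the counters this step works with, a small sketch like the following can print the total and free page counts for each page size on one NUMA node. The function name `print_hugepage_counts` is made up for this example, and it assumes the standard Linux sysfs layout in which every `hugepages-<size>kB` directory contains `nr_hugepages` and `free_hugepages` files.

```shell
#!/bin/sh
# print_hugepage_counts [node_dir]
# Prints one line per huge page size with the total and free page counts.
# node_dir defaults to NUMA node 0; pass another node directory to inspect it.
print_hugepage_counts() {
    node_dir="${1:-/sys/devices/system/node/node0}"
    for size_dir in "$node_dir"/hugepages/hugepages-*kB; do
        [ -d "$size_dir" ] || continue
        size="${size_dir##*/hugepages-}"        # e.g. 2048kB or 1048576kB
        total=$(cat "$size_dir/nr_hugepages")
        free=$(cat "$size_dir/free_hugepages")
        echo "$size total=$total free=$free"
    done
}
```

Running `print_hugepage_counts` before and after `fdctl configure init hugetlbfs` is one way to observe whether the global pool was grown.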
Once the pages have been reserved globally in the kernel pool, they are assigned specifically to Firedancer by creating a `hugetlbfs` mount at each of `/mnt/.fd/.gigantic/` and `/mnt/.fd/.huge` for gigantic and huge pages respectively. These paths can be configured in the TOML file under the `[hugetlbfs]` section. Let's run it:
$ sudo fdctl configure init hugetlbfs
NOTICE hugetlbfs ... unconfigured ... mounts `/mnt/.fd/.huge` and `/mnt/.fd/.gigantic` do not exist
NOTICE hugetlbfs ... configuring
NOTICE RUN: `mkdir -p /mnt/.fd/.huge`
NOTICE RUN: `mount -t hugetlbfs none /mnt/.fd/.huge -o pagesize=2097152,min_size=228589568`
NOTICE RUN: `mkdir -p /mnt/.fd/.gigantic`
NOTICE RUN: `mount -t hugetlbfs none /mnt/.fd/.gigantic -o pagesize=1073741824,min_size=27917287424`
$ cat /proc/mounts | grep \\.fd
none /mnt/.fd/.gigantic hugetlbfs rw,seclabel,relatime,pagesize=1024M,min_size=540092137472 0 0
none /mnt/.fd/.huge hugetlbfs rw,seclabel,relatime,pagesize=2M,min_size=95124124 0 0
This stage requires root privileges, and cannot be performed with capabilities. If the required hugetlbfs mounts are already present with at least the required amount of memory reserved, then the `init` mode does nothing and the `check` mode will return successfully without requiring privileges.
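For a quick manual look at what is currently mounted, the sketch below extracts the page size and `min_size` options of the hugetlbfs mounts from `/proc/mounts`. The function name `list_hugetlbfs_mounts` and its two optional arguments are hypothetical; the default prefix `/mnt/.fd/` matches the default paths mentioned above and would need adjusting if the paths were changed in the TOML file.

```shell
#!/bin/sh
# list_hugetlbfs_mounts [prefix] [mounts_file]
# Prints "<mountpoint> <pagesize> <min_size>" for every hugetlbfs mount
# whose mount point starts with the given prefix.
list_hugetlbfs_mounts() {
    prefix="${1:-/mnt/.fd/}"
    mounts_file="${2:-/proc/mounts}"
    awk -v prefix="$prefix" '
        $3 == "hugetlbfs" && index($2, prefix) == 1 {
            pagesize = ""; min_size = ""
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                # Both option names are 9 characters long including "=".
                if (opts[i] ~ /^pagesize=/) pagesize = substr(opts[i], 10)
                if (opts[i] ~ /^min_size=/) min_size = substr(opts[i], 10)
            }
            print $2, pagesize, min_size
        }' "$mounts_file"
}
```

This only reports what is mounted. It does not decide whether the reserved amount is sufficient, which is what `fdctl configure check hugetlbfs` is for.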
The `fini` mode will unmount the two filesystems, and remove them from `/mnt/.fd/`, although it will leave the `/mnt/.fd/` directory in place. The `fini` mode will not succeed if memory from the mounts is mapped into a running process.
If `fini` succeeds, the huge and gigantic pages that Firedancer had reserved will be returned to the kernel global pool so they can be used by other programs, but the global pool size will not be decreased, even if it was earlier increased during `init`.
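Because `fini` fails while memory from the mounts is still mapped, it can help to find out which processes hold such mappings. The following is a minimal sketch, not something `fdctl` provides: the function name `pids_mapping_prefix` is hypothetical, and it relies on the standard `/proc/<pid>/maps` files, which generally can only be read for other users' processes when running as root.

```shell
#!/bin/sh
# pids_mapping_prefix <path_prefix> [proc_root]
# Prints the PID of every process whose memory map contains a file
# under the given path prefix (for example /mnt/.fd/).
pids_mapping_prefix() {
    prefix="$1"
    proc_root="${2:-/proc}"
    for maps in "$proc_root"/[0-9]*/maps; do
        [ -r "$maps" ] || continue
        # Mapped file paths appear as the last, space-separated column.
        if grep -qF -- " $prefix" "$maps" 2>/dev/null; then
            pid_dir="${maps%/maps}"
            echo "${pid_dir##*/}"
        fi
    done
}
```

If `sudo sh -c '. ./script.sh; pids_mapping_prefix /mnt/.fd/'` (with `script.sh` standing in for wherever the function is saved) prints nothing, no readable process maps files under that path.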
TIP
The `hugetlbfs` step should be run immediately when the system is booted. If run later, it may fail because the operating system memory is fragmented and a large contiguous block cannot be reserved.
sysctl
It is suggested to run Firedancer with certain kernel parameters tuned for best performance. The `sysctl` stage will check and configure these parameters. The stage will only increase values to meet the minimum, and will not decrease them if the minimum is already met.
Sysctl | Minimum | Required | Description |
---|---|---|---|
/proc/sys/vm/max_map_count | 1000000 | Yes | Agave accounts database requires mapping many files. |
/proc/sys/fs/file-max | 1024000 | Yes | Agave accounts database requires opening many files. |
/proc/sys/fs/nr_open | 1024000 | Yes | Agave accounts database requires opening many files. |
/proc/sys/net/ipv4/conf/lo/rp_filter | 2 | Yes | If sending QUIC transactions to Firedancer over loopback, this must be enabled to receive a response. Otherwise Linux will drop response packets due to limitations in the kernel eBPF networking stack. The sendTransaction RPC call will send over loopback. |
/proc/sys/net/ipv4/conf/lo/accept_local | 1 | Yes | If sending QUIC transactions to Firedancer over loopback, this must be enabled to receive a response. Otherwise Linux will drop response packets due to limitations in the kernel eBPF networking stack. The sendTransaction RPC call will send over loopback. |
/proc/sys/net/core/bpf_jit_enable | 1 | No | Firedancer uses BPF for kernel bypass networking. BPF JIT makes this faster. |
/proc/sys/kernel/numa_balancing | 0 | No | Firedancer assigns all memory to the right NUMA node, and rebalancing will make the system slower. |
Sysctls that are not required will produce a warning if they are not set correctly, but configuration will proceed and exit normally.
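The required rows of the table can also be inspected by hand. The sketch below compares the current value of each required parameter against the minimum listed above and reports any that fall short. The function name `check_sysctl_minimums` and its optional root argument are hypothetical and exist only to make the example easy to try. It treats every value as a numeric minimum, as the table does, and it never writes to the system.

```shell
#!/bin/sh
# check_sysctl_minimums [proc_sys_root]
# Prints "ok" or "low" for each required parameter and returns non-zero
# if any value is missing or below the documented minimum.
check_sysctl_minimums() {
    root="${1:-/proc/sys}"
    status=0
    while read -r name min; do
        cur=$(cat "$root/$name" 2>/dev/null) || {
            echo "missing $name"
            status=1
            continue
        }
        if [ "$cur" -ge "$min" ]; then
            echo "ok $name=$cur"
        else
            echo "low $name=$cur (minimum $min)"
            status=1
        fi
    done <<EOF
vm/max_map_count 1000000
fs/file-max 1024000
fs/nr_open 1024000
net/ipv4/conf/lo/rp_filter 2
net/ipv4/conf/lo/accept_local 1
EOF
    return "$status"
}
```

The two optional parameters from the table are left out on purpose, since the stage only warns about them.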
The `init` mode requires either `root` privileges, or to be run with `CAP_SYS_ADMIN`. The `fini` mode does nothing and kernel parameters will never be reduced or changed back as a result of running `configure`.
ethtool-channels
In addition to XDP, Firedancer uses receive side scaling (RSS) to improve network performance. This uses functionality of modern NICs to steer packets to different queues to distribute processing among CPUs. See the kernel documentation for more information.
In Firedancer, each `net` tile serves one network queue, so the `ethtool-channels` stage will modify the combined channel count of the configured network device `[tiles.net.interface]` to be the same as the number of `net` tiles, `[layout.net_tile_count]`. If your NIC does not support the required number of queues, you will need to reduce the number of `net` tiles, potentially down to one for NICs which don't support queues at all.
The command run by the stage is similar to running `ethtool --set-channels <device> combined <N>`, but it also supports bonded devices. We can check that it worked:
$ sudo fdctl configure init ethtool-channels
NOTICE ethtool-channels ... unconfigured ... device `ens3f0` does not have right number of channels (got 1 but expected 2)
NOTICE ethtool-channels ... configuring
NOTICE ethtool-channels ... RUN: `ethtool --set-channels ens3f0 combined 2`
$ ethtool --show-channels ens3f0
Channel parameters for ens3f0:
Pre-set maximums:
RX: 64
TX: 64
Other: 1
Combined: 64
Current hardware settings:
RX: 0
TX: 0
Other: 1
Combined: 2
The stage only needs to be run once after boot but before running Firedancer. It has no dependencies on any other stage, although it is dependent on the number of `net` tiles in your configuration.
Changing device settings with `ethtool-channels` requires root privileges, and cannot be performed with capabilities.
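Before choosing a `net` tile count, it can be useful to know how many combined channels the NIC supports. The sketch below pulls the maximum and current combined counts out of `ethtool --show-channels` output read from standard input. The function name `combined_channels` is hypothetical, and the parsing assumes the output layout shown in the transcript above.

```shell
#!/bin/sh
# combined_channels
# Reads `ethtool --show-channels <device>` output on stdin and prints
# the pre-set maximum and the current number of combined channels.
combined_channels() {
    awk '
        /^Pre-set maximums:/          { section = "max" }
        /^Current hardware settings:/ { section = "current" }
        $1 == "Combined:"             { value[section] = $2 }
        END { print "max=" value["max"], "current=" value["current"] }
    '
}
```

For the device in the transcript, `ethtool --show-channels ens3f0 | combined_channels` would report a maximum of 64, so any `net` tile count up to that value could be satisfied by the hardware.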
ethtool-gro
XDP is incompatible with a feature of network devices called `generic-receive-offload`. This feature must be disabled for Firedancer to work.
The command run by the stage is similar to running `ethtool --offload <device> generic-receive-offload off`, but it also supports bonded devices. We can check that it worked:
$ sudo fdctl configure init ethtool-gro
NOTICE ethtool-gro ... unconfigured ... device `ens3f0` has generic-receive-offload enabled. Should be disabled
NOTICE ethtool-gro ... configuring
NOTICE ethtool-gro ... RUN: `ethtool --offload ens3f0 generic-receive-offload off`
$ ethtool --show-offload ens3f0 | grep generic-receive-offload
generic-receive-offload: off
The stage only needs to be run once after boot but before running Firedancer. It has no dependencies on any other stage.
Changing device settings with `ethtool-gro` requires root privileges, and cannot be performed with capabilities.
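The `grep` check shown in the transcript can be wrapped in a small helper that returns just the state of one offload feature, which is convenient in scripts. This is a sketch under the assumption that `ethtool --show-offload` prints one `<feature>: <state>` pair per line, as in the transcript; the function name `offload_state` is made up for this example.

```shell
#!/bin/sh
# offload_state <feature>
# Reads `ethtool --show-offload <device>` output on stdin and prints the
# state ("on" or "off") of the named feature. Returns non-zero if the
# feature does not appear in the output.
offload_state() {
    awk -v feature="$1" '
        $1 == (feature ":") { print $2; found = 1 }
        END { exit !found }
    '
}
```

For example, `ethtool --show-offload ens3f0 | offload_state generic-receive-offload` should print `off` once the stage has run. The same helper works for `tx-udp-segmentation` on the `lo` device used by the `ethtool-loopback` stage.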
ethtool-loopback
XDP is incompatible with localhost UDP traffic using a feature called `tx-udp-segmentation`. This feature must be disabled when connecting Agave clients to Firedancer over loopback, or when using Frankendancer.
The command run by the stage is `ethtool --offload lo tx-udp-segmentation off`. We can check that it worked:
$ sudo fdctl configure init ethtool-loopback
NOTICE ethtool-loopback ... unconfigured ... device `lo` has tx-udp-segmentation enabled. Should be disabled
NOTICE ethtool-loopback ... configuring
NOTICE ethtool-loopback ... RUN: `ethtool --offload lo tx-udp-segmentation off`
$ ethtool --show-offload lo | grep tx-udp-segmentation
tx-udp-segmentation: off
The stage only needs to be run once after boot but before running Firedancer. It has no dependencies on any other stage.
Changing device settings with `ethtool-loopback` requires root privileges, and cannot be performed with capabilities.