syncthingcsi/README.md
dave 45144e6808
2024-08-22 18:12:48 -07:00

syncthing-csi

Hyperconverged CSI storage for Kubernetes via Syncthing

Deploy

  1. Generate the default config for Syncthing. See the example in ./deploy/terraform/example_default_config.xml.

    This is largely Syncthing's default config, with two changes: features that may access external networks, such as relays and crash reporting, are disabled, and syncing of file permissions is enabled.

  2. Deploy using Terraform. ./deploy/terraform/ is a Terraform module. Example inputs:

    inputs = {
        syncthing_default_config = file("default_config.xml")
        syncthing_api_key        = "xyz"  # must match the api key in the above config  
    }
    
  3. Label one or more nodes as storage nodes (see "How it works" below):

    kubectl label node nodename syncthing.csi.davepedu.com/syncthing-storage=yes
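
Before deploying, it can help to confirm that the config from step 1 and the API key from step 2 agree. The following is a minimal sketch, not part of this repository: `check_default_config` is a hypothetical helper, and the element names (`gui/apikey`, `options/relaysEnabled`, `options/crashReportingEnabled`, and the `ignorePerms` attribute on `defaults/folder`) are taken from Syncthing's config.xml layout and should be checked against the example file.

```python
import xml.etree.ElementTree as ET


def check_default_config(xml_text, api_key):
    """Return a list of problems found in a Syncthing config; empty means OK.

    Hypothetical helper: element names are assumed from Syncthing's
    config.xml format, not taken from this project's code.
    """
    root = ET.fromstring(xml_text)
    problems = []

    # The API key is expected in <gui><apikey>.
    if root.findtext("gui/apikey") != api_key:
        problems.append("gui/apikey does not match syncthing_api_key")

    # Options that may reach external networks should be switched off.
    for option in ("relaysEnabled", "crashReportingEnabled"):
        if root.findtext(f"options/{option}") != "false":
            problems.append(f"options/{option} is not disabled")

    # Permissions are synced when the default folder has ignorePerms="false".
    folder = root.find("defaults/folder")
    if folder is None or folder.get("ignorePerms") != "false":
        problems.append("defaults/folder does not sync file permissions")

    return problems
```

For example, `check_default_config(open("default_config.xml").read(), "xyz")` returns an empty list when the file matches the inputs shown in step 2.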
    

How it works

Syncthing is a tool that shares and synchronizes directories over peer-to-peer network connections. This program, Syncthing-csi, maps Kubernetes Persistent Volumes to shared directories and leverages Syncthing to manage files. Syncthing-csi runs alongside Syncthing as a DaemonSet in a Kubernetes cluster.

Nodes with the syncthing-storage=yes label are "data nodes" and store a replica of every Syncthing-csi volume. It is therefore recommended to apply this label to multiple nodes to ensure data durability in case of node failure.

When Kubernetes creates a Persistent Volume, it calls on Syncthing-csi to create it, which in turn calls on Syncthing to create a new shared directory (Syncthing calls these shared "folders").

When Kubernetes wants to run a Pod with a Syncthing-csi volume mounted, it calls on Syncthing-csi to mount the volume. Syncthing-csi handles this by adding the shared folder to the node's Syncthing instance, waiting for it to sync, and bind-mounting the folder where Kubernetes wants it.
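
The wait-then-mount part of that flow can be sketched roughly as follows. This is an illustration under assumptions, not the driver's actual code: `folder_status`, `wait_for_sync`, and `bind_mount_command` are hypothetical names, the timeout and interval values are arbitrary, and the sketch assumes Syncthing's REST endpoint `GET /rest/db/status` (authenticated with the `X-API-Key` header) reports `state` and `needBytes` for a folder. Adding the folder to the node's Syncthing instance is left out.

```python
import json
import time
import urllib.parse
import urllib.request


def folder_status(base_url, api_key, folder_id):
    """Fetch a folder's status from Syncthing's REST API (GET /rest/db/status)."""
    query = urllib.parse.urlencode({"folder": folder_id})
    request = urllib.request.Request(
        f"{base_url}/rest/db/status?{query}",
        headers={"X-API-Key": api_key},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def wait_for_sync(get_status, timeout=300.0, interval=2.0, sleep=time.sleep):
    """Poll get_status() until the folder is idle with nothing left to pull.

    Assumption: "idle" with needBytes == 0 is treated as "synced".
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status.get("state") == "idle" and status.get("needBytes") == 0:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("folder did not finish syncing in time")
        sleep(interval)


def bind_mount_command(source, target):
    """Build (but do not run) the argv for bind-mounting source onto target."""
    return ["mount", "--bind", source, target]
```

A caller would pass something like `lambda: folder_status(url, key, folder_id)` as `get_status`, then run the command from `bind_mount_command` once `wait_for_sync` returns. Taking the status source as a callable keeps the polling loop testable without a running Syncthing instance.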

A deployment contains:

  • Controller deployment: handles creation/deletion of shared folders in response to PVC changes
    • External-attacher: Kubernetes-provided tool that calls Syncthing-csi's RPC for VolumeAttachments
  • CSI daemonset: handles volume manipulation and CSI API calls
    • Syncthing: file synchronization server
    • CSI: Contains subprocesses:
      • Resharer: re-shares existing data to new data nodes
      • Deleter: deletes on-disk data for volumes that are being removed
      • NodeWatcher: handles transitioning nodes between being a data node or not
      • NodeAPI: provides an API on the node the controller requires for some operations
    • Introducer: Introduces new Syncthing instances to the rest of the cluster
    • Registrar: Kubernetes-provided tool that provides setup plumbing

Options

The Terraform module accepts additional inputs to configure the namespace and resources and to enable verbose logging. Further options customize CSI behavior:

  • driver_name: Defaults to syncthing.csi.davepedu.com. The name under which this CSI driver registers itself within the Kubernetes cluster. Note that the label for storage nodes, in deployment step 3 above, also contains this string and must be adjusted to match.
  • host_data_dir: Defaults to /var/syncthing-csi. Absolute path on the host under which all Syncthing and CSI data is stored for this driver.
  • name_suffix: No default value. String to append to the names of resources. Some items, such as ClusterRole, must have a unique name.
  • node_label: No default value. Restrict operation to nodes that have this label. Example: syncthing-experiment=true

Note that by adjusting driver_name, host_data_dir, and name_suffix, it is possible to run multiple instances of this CSI driver in the same Kubernetes cluster.
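
Because the storage-node label depends on driver_name, each instance needs its own label. The sketch below shows one way to derive it, assuming the label key is simply driver_name followed by `/syncthing-storage`, as the default in deployment step 3 suggests. `storage_label` and `label_command` are hypothetical helpers, not part of the module.

```python
DEFAULT_DRIVER_NAME = "syncthing.csi.davepedu.com"


def storage_label(driver_name=DEFAULT_DRIVER_NAME):
    """Build the storage-node label key for a driver_name.

    Assumption: the key is "<driver_name>/syncthing-storage".
    """
    return f"{driver_name}/syncthing-storage"


def label_command(node, driver_name=DEFAULT_DRIVER_NAME):
    """Return the kubectl command that marks a node as a storage node."""
    return f"kubectl label node {node} {storage_label(driver_name)}=yes"
```

With the default driver_name, `label_command("nodename")` reproduces the command from deployment step 3. A second instance with a different driver_name gets a different label key.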

Development

TODO list

  • Review TODOs in code and below
  • High-level command interface
    • ShareFolderToNode() etc
  • Look into ReclaimPolicy and how to use it - we should be able to support Recycle mode (with a newer kube sdk...)
  • Derive host/volume IDs (e.g. from nodeID or other input?) (what did I mean by this?)
  • Timeouts on most stuff
  • Look into VolumeContentSource
  • Add Watch() in addition to the 30 second polling bits
  • Test on K3s and Kind (https://github.com/kubernetes-sigs/kind)