Node Details¶
Under Construction
This page is still being written
Virtual Machines¶
Our virtual machines (VMs) are provisioned by ITS using VMware vSphere. The vSphere hypervisor client interface is where we control the state of a VM (stopped, running, paused, snapshots, etc.).
The base image for all nodes is CentOS 7.9, patched periodically with security updates by an Ansible script. The one exception is a legacy Windows Server 2012 R2 machine that allows Course Production to use Respondus from their Mac laptops.
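As a rough illustration (not the actual script), an Ansible play that applies only security-related updates on CentOS might look like the sketch below; the host pattern is a placeholder:

```yaml
# Hedged sketch: apply only security-marked updates on CentOS 7 nodes.
# "all" is a placeholder host pattern, not the LTC's actual inventory group.
- hosts: all
  become: true
  tasks:
    - name: Apply outstanding security updates
      ansible.builtin.yum:
        name: "*"
        state: latest
        security: true   # restrict the update to packages flagged as security fixes
```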
The LTC's servers are located in the ITS machine rooms on the Burnaby and Downtown campuses. IPs in the .76 subnet are on the Burnaby campus, while those in the .110 subnet are at the Downtown campus.
Kubernetes¶
Most nodes are configured to be members of a Kubernetes cluster, and details of the LTC's setup are outlined on the Kubernetes page. Other VMs have different, specialized roles, described below.
VM (Node) Roles¶
RKE2 is deployed on nodes destined to be Kubernetes members using the rke2-ansible playbook, and the configuration files are located in the VM Node Configuration project.
The ansible-rke2 collection is used to configure and provision the Kubernetes clusters. See the ansible-rke2/inventory folder for a list of the current nodes, categorized by cluster.
- See the RKE2 documentation.
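As a concrete illustration, an inventory that groups nodes by cluster and role might look something like the following sketch. The group names follow the upstream rke2-ansible convention and the hostnames are placeholders, so the real ansible-rke2/inventory files may differ:

```yaml
# Hypothetical inventory sketch (YAML format); hostnames are placeholders.
all:
  children:
    rke2_servers:              # manager / control-plane nodes
      hosts:
        prod-mgr1.example.edu:
        prod-mgr2.example.edu:
        prod-mgr3.example.edu:
    rke2_agents:               # worker nodes
      hosts:
        prod-wkr1.example.edu:
        prod-wkr2.example.edu:
```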
After deploying RKE2, Rancher is installed via Helm using the Helm chart found in the Rancher project.
- See the helm-install-rancher `values.yaml` file, which is deployed using Terraform.
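For orientation, a minimal Rancher `values.yaml` might contain entries like the following; the hostname, replica count, and TLS source here are illustrative, not the values actually deployed:

```yaml
# Hedged sketch of Rancher Helm chart values; all values are placeholders.
hostname: rancher.example.edu   # URL the Rancher UI answers on
replicas: 3                     # number of Rancher pods to run
ingress:
  tls:
    source: secret              # serve a certificate stored in a Kubernetes Secret
```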
Manager nodes¶
- schedule and coordinate the deployment of workloads to available worker nodes
Worker nodes¶
- contribute disk storage to the persistent disk storage provisioner, Longhorn
- run container workloads
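For context, a node joins an RKE2 cluster as a worker through its `/etc/rancher/rke2/config.yaml`; the sketch below is illustrative only, with a placeholder server URL, token, and label:

```yaml
# Hypothetical /etc/rancher/rke2/config.yaml for an agent (worker) node.
server: https://prod-mgr1.example.edu:9345   # supervisor port on an existing manager (placeholder)
token: <cluster-join-token>                  # shared secret from the first server node (placeholder)
node-label:
  - "workload=general"                       # illustrative label, not an LTC convention
```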
Longhorn¶
Longhorn is used by all worker nodes.
Longhorn is a persistent disk provisioner for Kubernetes; it allows pods to request storage that remains available independent of the pod lifecycle. Regardless of the cluster, any node with a worker role (including the combo role) has an additional block device that is dedicated for use by Longhorn.
ansible-node-configuration/playbooks/03_add_lvm_device.yaml is used to configure the Longhorn block device. Devices are configured as LVM volumes so that additional storage can easily be appended later if needed.
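The playbook itself is not reproduced here, but a hedged sketch of the kind of tasks involved (create a volume group on the spare disk, carve out a logical volume, format it, and mount it where Longhorn keeps its data) might look like this; the host group, device name, and volume names are all placeholders:

```yaml
# Hypothetical sketch only; device paths, names, and the host group are placeholders.
- hosts: workers
  become: true
  tasks:
    - name: Create a volume group on the dedicated Longhorn block device
      community.general.lvg:
        vg: vg_longhorn
        pvs: /dev/sdb                        # assumed name of the extra block device

    - name: Create a logical volume spanning the volume group
      community.general.lvol:
        vg: vg_longhorn
        lv: lv_longhorn
        size: 100%VG

    - name: Format the logical volume
      community.general.filesystem:
        fstype: ext4
        dev: /dev/vg_longhorn/lv_longhorn

    - name: Mount it at Longhorn's default data path
      ansible.posix.mount:
        path: /var/lib/longhorn
        src: /dev/vg_longhorn/lv_longhorn
        fstype: ext4
        state: mounted
```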
- In the `prod` cluster, the operator is deployed with the configuration files in the `Longhorn` project. This project also stores the Terraform state files used to manage this service.
- In the `dev_cp`, `dev_vsm`, and `staging` clusters, the operator is deployed using the GUI: see details in the `Cluster Explorer` view under `Apps & Marketplace`.
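However the operator is installed, workloads consume Longhorn storage the same way: a pod requests a PersistentVolumeClaim against the `longhorn` StorageClass that the operator installs. A minimal, illustrative claim (the name and size are placeholders) might look like this:

```yaml
# Hedged example of a PVC backed by Longhorn; metadata and size are illustrative.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn      # StorageClass provided by the Longhorn operator
  resources:
    requests:
      storage: 5Gi
```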
Load Balancer Roles¶
`(?:prod|dev)-gate*` nodes run the community version of Traefik as a layer 4 load balancer (LB). DNS entries are updated programmatically using an Ansible script; details are in the `traefik-docker` project.
- The LBs have `firewalld` configured to block most traffic. See the `base-firewalld-config` path in the VM Node Configuration project for playbooks that set this up.
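To illustrate the layer 4 role, a Traefik dynamic configuration doing plain TCP pass-through might resemble the sketch below; the entrypoint name, router name, and backend addresses are placeholders, not the LTC's actual configuration:

```yaml
# Hypothetical Traefik (v2) dynamic config for layer 4 TCP pass-through.
# The "websecure" entrypoint is assumed to be defined in the static config.
tcp:
  routers:
    k8s-ingress:
      entryPoints:
        - websecure
      rule: "HostSNI(`*`)"       # match any SNI, i.e. forward all TLS traffic untouched
      service: k8s-ingress
      tls:
        passthrough: true        # do not terminate TLS on the load balancer
  services:
    k8s-ingress:
      loadBalancer:
        servers:
          - address: "10.0.0.11:443"   # placeholder worker node address
          - address: "10.0.0.12:443"   # placeholder worker node address
```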
Legacy Systems¶
Prod1¶
This is the original production web server, which has served the LTC's web apps since 2010. We are transitioning workloads off this Apache-based web server into containers that run on Kubernetes.
Legacy Kubernetes¶
In 2018 a proof-of-concept Kubernetes cluster was built to research and validate Kubernetes technology. This set of servers was deployed with an old version of Rancher that uses Docker engine to run workloads. StorageOS was deployed as a persistent storage provisioner.
The cluster was installed without configuration files; all configuration was added through the GUI and is not recorded.
Research/Experimental¶
Some example projects being researched:
- Nomad server: Nomad is an alternative workload orchestrator that has been deployed as an R&D project. See https://learn.hashicorp.com/nomad.
- Waypoint server: Waypoint is a development/deployment/release pipeline tool that has been deployed as a Nomad workload for R&D. See https://learn.hashicorp.com/waypoint.
- Consul service mesh
- HAProxy Ingress Controller