Getting Started

Overview

This guide describes the process of installing and managing DKube. The tool:

  • Ensures that the target system is accessible and has the right software prerequisites

  • Installs DKube on the cluster

  • Manages DKube on the cluster after installation

After installation, the management capabilities include:

  • Backing up and restoring DKube (see Backup and Restore)

  • Stopping and starting DKube (see Stop and Start DKube)

  • Upgrading DKube (see Upgrade DKube Version)

DKube Configuration

The cluster can include one or more master nodes and optional worker nodes.

  • The Master node coordinates the cluster, and can optionally contain GPUs

  • Each Worker node provides additional resources, expanding the capacity of the cluster

The Master node must always be running for the cluster to be active. Worker nodes can be added and removed, and the cluster will continue to operate. This is described in the section Restarting DKube After Cluster Restart.

Installation Configuration

The installation scripts can be run:

  • From the master node in the cluster, or

  • From a remote node that is not part of the cluster

The overall flow of installation is as follows:

  • Copy the installation scripts and associated files to the installation node (master node or remote node)

  • Ensure that the installation node has passwordless access to all of the DKube cluster nodes

  • Edit the installation ini files with the appropriate options

  • Install DKube and its required software components

  • Access DKube through a browser

The figures below show the 2 possible configurations for installation. The only requirement is that the installation node (either the master node on the cluster or a remote node) must have passwordless access to all of the nodes in the cluster. This is discussed in more detail in the sections on installation.

Master Node Installation

[Figure: Installation block diagram, local (master node) installation]

In a local installation, the scripts are run from the master node of the DKube cluster.

Important

Even if the installation is executed from the master node on the cluster, the ssh public key still needs to be added to the appropriate file on all of the nodes on the cluster, including the master node. This is explained in the section on passwordless ssh key security.
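For illustration only, one common way to set up passwordless access is to copy the public key to each cluster node with ssh-copy-id; the user name, key path, and node address below are placeholders:

# Append the public key to authorized_keys on each node, including the master
ssh-copy-id -i <path-to-public-key> <user>@<node-ip>
# Verify that the node can now be reached without a password prompt
ssh <user>@<node-ip> hostname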

Remote Node Installation

[Figure: Installation block diagram, remote node installation]

DKube and its associated applications can be installed and managed from a remote node that is not part of the DKube cluster. The installation node needs to have passwordless access to all of the nodes on the DKube cluster.

DKube and Kubernetes

DKube requires Kubernetes to operate. This guide assumes that a supported version of Kubernetes has been installed on the cluster.

Prerequisites

Supported Platforms

The following platforms are supported for DKube:

  • Installation platform can be any node running:

      • Ubuntu 18.04

      • CentOS 7.9

  • Kubernetes 1.18

  • Cluster nodes can include one of the following:

      • On-prem (bare metal or VM)

      • Google GCP

      • Amazon AWS

      • Amazon EKS

      • Rancher 2.4

Note

Not all combinations of provider and OS are supported. Additional platforms are being released continually and are described in application notes.

The DKube installation scripts handle most of the work in getting DKube installed, including the installation of the software packages on the cluster. There are some prerequisites for each node, described below.

Node Requirements

Installation Node Requirements

The installation node has the following requirements:

  • A supported operating system

  • Docker CE

Docker Installation on Ubuntu

The following commands can be used to install Docker on Ubuntu:

sudo apt-get update
sudo apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common -y
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install docker-ce -y

Docker Installation on CentOS

The following commands can be used to install Docker on CentOS:

sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install -y docker-ce-18.09.2-3.el7 docker-ce-cli-18.09.2-3.el7 containerd.io
sudo systemctl start docker
sudo systemctl enable docker
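To confirm that Docker is installed and running on the installation node, a quick check such as the following can be used:

sudo docker --version
sudo docker run --rm hello-world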

DKube Cluster Node Requirements

The DKube Cluster nodes have the following requirements:

  • A supported operating system

  • All nodes should have static IP addresses, even if the node is a VM on a cloud

  • Node names must be lower case

  • All nodes must be on the same subnet

  • All nodes must have the same user name and ssh key

Each node on the cluster should have the following minimum resources:

  • 16 CPU cores

  • 64GB RAM

  • Storage: at least 200GB; the required size depends on the programs and datasets used and should be large enough to hold the necessary data
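As a quick sanity check, the CPU, memory, and available storage on a node can be inspected with standard Linux commands, for example:

# Number of CPU cores
nproc
# Total memory
free -h
# Available disk space
df -h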

Important

Only GPUs of the exact same type can be installed on a node. For example, you cannot mix an NVIDIA V100 and a P100 on the same node. Even GPUs of the same class must have the same configuration (e.g. memory).
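If the NVIDIA driver and the nvidia-smi utility are already present on a node, the installed GPUs can be listed to confirm that they are all of the same type and configuration:

nvidia-smi -L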

Important

The Nouveau driver should not be installed on any of the nodes in the cluster. If the driver is installed, you can follow the instructions in the section Removing Nouveau Driver.
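A quick way to check whether the Nouveau driver is currently loaded on a node:

# No output means the Nouveau driver is not loaded
lsmod | grep nouveau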

Access to the Cluster

In order to run DKube both during and after installation, a minimum level of network access must be provided from any system that needs to use the cluster. This includes access to the URL used to open DKube from a browser.

Protocol   Port Range   Source
--------   ----------   --------------
TCP        30002        Access IP
TCP        32222        Access IP
TCP        32223        Access IP
TCP        32323        Access IP
TCP        6443         Access IP
TCP        443          Access IP
TCP        22           Access IP
All        0-65535      Private Subnet
ICMP       0-65535      Access IP

The source IP access range is in CIDR format. It consists of an IP address and mask combination. For example:

  • 192.168.100.14/24 would allow IP addresses in the range 192.168.100.x

  • 192.168.100.14/16 would allow IP addresses in the range 192.168.x.x

Note

The source IP address 0.0.0.0/0 can be used to allow access from any browser client. If this is used, then a firewall can be used to enable appropriate access.
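For illustration, on a node that uses the ufw firewall (ufw is not a DKube requirement), rules similar to the following would restrict two of the ports above to a specific subnet:

# Allow TCP port 30002 only from the 192.168.100.0/24 subnet
sudo ufw allow from 192.168.100.0/24 to any port 30002 proto tcp
# Allow the Kubernetes API server port 6443 from the same subnet
sudo ufw allow from 192.168.100.0/24 to any port 6443 proto tcp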

For specific platform types, additional or different steps must be taken:

  • GCP: Google GCP System Installation

  • AWS: Amazon AWS System Installation

Cluster and DKube Resiliency

For highly available operation, DKube supports multi-node resiliency (HA). An HA system prevents any single point of failure through redundant operation. Resilient operation requires at least 3 nodes. There are 2 independent types of resiliency: cluster and DKube. The details of how to configure DKube for resilient operation are provided in the pertinent sections that explain how to complete the ini files. Cluster resiliency is specific to the Kubernetes installation.

Cluster Resiliency

Cluster resiliency provides the ability of Kubernetes to offer a highly available control plane. Since the master node in a k8s system manages the cluster, cluster resiliency is enabled by having 3 master nodes. There can be any number of worker nodes. In a resilient cluster, a load balancer monitors the health of pods running on all of the master nodes. If a pod goes down, requests are automatically sent to pods running on other master nodes. In such a system, any number of worker nodes can go down and the cluster remains usable, but only a single master node can go down and still have the system continue.

Note

Since the master node manages the cluster, for the best resiliency it is advisable to not install any GPUs on the master nodes, and to prevent any DKube-related pods from being scheduled on them. It is up to the user to ensure that the cluster is resilient. Depending upon the type of k8s, the details will vary.

DKube Resiliency

DKube resiliency is independent of - and can be enabled with or without - cluster resiliency. If the storage is installed by DKube, resiliency ensures that the storage and databases for the application have redundancy built in. This prevents an issue with a single node from corrupting the DKube operation. Externally configured storage is not part of DKube resiliency. For DKube resiliency to function, there must be at least 3 schedulable nodes. That is, 3 nodes that allow DKube pods to be scheduled on them. The nodes can be master nodes or worker nodes in any combination.

In order to enable DKube resiliency, the HA option must be set to “true” in the dkube.ini file.
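As a sketch only (the section where the option appears in dkube.ini depends on the DKube version and is described later in this guide), the setting looks like:

# Enable DKube resiliency; requires at least 3 schedulable nodes
HA: true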

Resiliency Examples

There are various ways that resiliency can be enabled at different levels. This section lists some examples:

Nodes   Master Nodes   Worker Nodes   Master Schedulable   Resiliency
-----   ------------   ------------   ------------------   ---------------
3       1              2              Yes                  DKube Only
3       1              2              No                   No Resiliency
3       3              0              Yes                  Cluster & DKube
4       1              3              Yes/No               DKube Only
4       3              1              Yes                  Cluster & DKube
4       3              1              No                   Cluster Only
6       3              3              Yes/No               Cluster & DKube

Installation with Existing DKube Data

When installing and uninstalling DKube, the existing DKube storage database can be preserved for re-use or wiped clean. When preserved, an installation of the same version of DKube will start with the contents of the previous installation. This is controlled through the --wipe-data switch when executing the following dkubeadm commands:

Operation   Command                                      Behavior
---------   ------------------------------------------   -----------------------------------------------
Install     sudo ./dkubeadm dkube install                Use the previous data if available
Install     sudo ./dkubeadm dkube install --wipe-data    Do not use the previous data, even if available
Cleanup     sudo ./dkubeadm node cleanup                 Do not remove the existing DKube data storage
Cleanup     sudo ./dkubeadm node cleanup --wipe-data     Remove the existing DKube data storage

Important

Previous data can only be used by an installation of the same DKube version

Node Affinity

DKube allows you to optionally determine what kinds of jobs and workload types get scheduled on each node in the cluster. For example, you might want certain nodes to be used exclusively for GPU-based jobs, or you might want some nodes to be used only for production serving. This control is based on directives that you provide to DKube during installation, which then match up with the node affinity capability built into Kubernetes.

Note

The node affinity capability is optional. If no directives are given to DKube, any job or workload can be run on any node in the cluster.

Node Affinity Usage

This section provides the details on how to use the node affinity capability, with an example.

The node rules are provided in the [NODE-AFFINITY] section of the dkube.ini file, described later in the guide. An example of this section is provided here.

[NODE-AFFINITY]
# Nodes identified by labels on which the dkube pods must be scheduled
# Example: DKUBE_NODES_LABEL: key1=value1
DKUBE_NODES_LABEL: management=true
# Nodes to be tolerated by dkube control plane pods so that only they can be scheduled on the nodes
# Example: DKUBE_NODES_TAINTS: key1=value1:NoSchedule,key2=value2:NoSchedule
DKUBE_NODES_TAINTS: management=true:NoSchedule
# Taints of the nodes where gpu workloads must be scheduled.
# Example: GPU_WORKLOADS_TAINTS: key1=value1:NoSchedule,key2=value2:NoSchedule
GPU_WORKLOADS_TAINTS: gpu=true:NoSchedule
# Taints of the nodes where production workloads must be scheduled.
# Example: PRODUCTION_WORKLOADS_TAINTS: key1=value1:NoSchedule,key2=value2:NoSchedule
PRODUCTION_WORKLOADS_TAINTS: production=true:NoSchedule

Within the dkube.ini file, there are 2 types of field designations:

LABEL

Identified job types can only be scheduled on nodes with this label, but a label does not prevent other job types from also being scheduled on the node

TAINT

Identified job types are the only job types scheduled on nodes with this taint

The definitions in the dkube.ini example file above create 3 types of nodes:

management

Management node

gpu

Node that will run a GPU job

production

Node that will handle production jobs

So, in this example:

  • Since DKUBE_NODES_LABEL has “management=true”:

      • Control jobs can only be executed on nodes with the “management” label, but

      • Worker jobs can be scheduled on any node, including the nodes with the “management” label

  • Since DKUBE_NODES_TAINTS has “management=true:NoSchedule”, control jobs are the only jobs that can be scheduled on nodes with that taint

Assigning a Label

Node labels restrict certain job types to run only on that node, but do not prevent other jobs from also running on that node. In order to assign several nodes the “management” label, the command would be:

kubectl label node <node-1> <node-2> management=true

Assigning a Taint

Node taints restrict certain job types to run only on that node, and prevent any other job type from running on that node. In order to assign several nodes the “management-only” taint, the command would be:

kubectl taint node <node-1> <node-2> management=true:NoSchedule
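The labels and taints that have been applied can be verified with standard kubectl commands, for example:

# Show the labels on all nodes
kubectl get nodes --show-labels
# Show the taints on a specific node
kubectl describe node <node-1> | grep -i taints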

CI/CD

DKube provides the ability to automatically build and register Docker images based on a set of criteria. The settings are provided in the [CICD] section of dkube.ini file, described later in the guide. The Docker images will be pushed to the registry provided in the ini file.

[Figure: CICD section of the dkube.ini file]

The following fields should be changed to enable CICD. The other fields should be left in their default settings.

Field               Value
-----------------   ------------------------------------------
ENABLED             True
DOCKER_REGISTRY     Name of the Docker registry to save images
REGISTRY_USERNAME   Username for the Docker registry
REGISTRY_PASSWORD   Password for the Docker registry
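As a sketch of what the [CICD] section might contain, using the fields from the table above (the exact keys and defaults are defined in the dkube.ini file shipped with your DKube version):

[CICD]
# Enable automatic building and registration of Docker images
ENABLED: True
# Registry where the built images will be pushed, with credentials
DOCKER_REGISTRY: <registry>/<organization>
REGISTRY_USERNAME: <registry username>
REGISTRY_PASSWORD: <registry password>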

Getting the DKube Files

The files necessary for installation, including the scripts, the .ini files, and any other associated files are pulled from Docker, using the following commands:

sudo docker login -u <Docker username>
Password: <Docker password>
sudo docker pull ocdr/dkubeadm:<DKube version>
sudo docker run --rm -it -v $HOME/.dkube:/root/.dkube ocdr/dkubeadm:<DKube version> init

Note

The docker credentials and DKube version number (x.y.z) are provided separately.

This will copy the necessary files to the folder $HOME/.dkube:

dkubeadm

Tool used to install & manage Kubernetes & DKube on the cluster

k8s.ini

Configuration file for cluster-related installation activities such as node setup

dkube.ini

Configuration file for DKube installation

ssh-rsa Key Pair

ssh key pair for passwordless access to the remote machines

Platform-Specific Installation Procedure

The installation procedure depends upon the type of platform and the type of Kubernetes (Community or managed).

Kubernetes   GCP   AWS   On-Prem   Instructions
----------   ---   ---   -------   -----------------------------------------
EKS                 x              Installing DKube on an Amazon EKS Cluster
Rancher       x     x      x       Installing DKube on a Rancher Cluster