Using Proxmox to build a working Ceph Cluster


Proxmox Version Used – 5.0

Hardware – 4 x Intel NUC with 16 GB RAM each, with an SSD for the Proxmox O/S and 3 TB USB disks for use as OSDs

Note: This is not a tutorial on Ceph or Proxmox; it assumes familiarity with both. The intent is to show how to rapidly deploy Ceph using the capabilities of Proxmox.

Steps

  1. Create a basic Proxmox Cluster
  2. Install Ceph
  3. Create a three node Ceph Cluster
  4. Configure OSDs
  5. Create RBD Pools
  6. Use the Ceph RBD storage as VM space for Proxmox

Creating the Proxmox Cluster

Initially a four node Proxmox cluster will be created. Within this configuration three of the Proxmox cluster nodes will be used to form a ceph cluster. This ceph cluster will, in turn, provide storage for various VMs used by Proxmox. The nodes in question are proxmox127, proxmox128 and proxmox129. The last three digits of the hostname correspond to the last octet of the node’s IP address. The network used is 192.168.1.0/24.

The first task is to create a normal Proxmox cluster – as well as the three ceph nodes mentioned, the Proxmox cluster will also include a non-ceph node, proxmox126.

The assumption is that the Proxmox nodes have already been created. Create a /etc/hosts file and copy it to each of the other nodes so that the nodes are “known” to each other. Open a browser and point it to https://192.168.1.126:8006 as shown below.
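
For reference, the /etc/hosts entries distributed to the nodes might look similar to the following (hostnames and addresses taken from this example):

192.168.1.126 proxmox126
192.168.1.127 proxmox127
192.168.1.128 proxmox128
192.168.1.129 proxmox129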

Open a shell and create the Proxmox cluster.
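
For example, on proxmox126 (the cluster name used here is arbitrary):

# pvecm create homecluster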

Next add the remaining nodes to this cluster by logging on to each node and specifying a node where the cluster is running.
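
For example, on each of the remaining nodes (assuming the cluster was created on proxmox126):

# pvecm add 192.168.1.126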

Check the status of the cluster.
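
For example:

# pvecm status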

The browser should now show all the nodes.

Creating the ceph cluster

This cluster is an example of a hyper-converged cluster in that the Monitor nodes and OSD nodes exist on the same server. The ceph cluster will be built on nodes proxmox127, proxmox128 and proxmox129.

Install the ceph packages on each of the three nodes.
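
This can be done with the pveceph helper; a sketch, assuming the Luminous release that shipped with Proxmox 5.x (the --version flag is optional):

# pveceph install --version luminous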

Next specify the cluster network. Note this need only be specified on the first node.

# pveceph init --network 192.168.1.0/24

After this an initial ceph.conf file is created in /etc/ceph/. Edit the file to change the default pool replication size from 3 to 2 and the default minimum pool size from 2 to 1. This is also a good time to make any other changes to the ceph configuration, as the cluster has not been started yet. Typically enterprise users require a replication size of three, but since this is a home system where usable capacity is more of a consideration, a replication size of two is used here; the final choice is, of course, in the domain of the System Administrator.

# vi /etc/ceph/ceph.conf
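
After the edit, the relevant lines in the [global] section would read roughly as follows (a sketch; the remainder of the generated file is left unchanged):

[global]
     osd pool default size = 2
     osd pool default min size = 1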

Create the monitor on all three nodes. Note it is possible to use just one node but for resiliency purposes three are better. This will start the ceph cluster.
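
On each of the three nodes run:

# pveceph createmon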

The ceph.conf file on the initial ceph node will be pushed out to the other nodes as they create their own monitors.

The crush map and ceph.conf can be shown from the GUI by selecting <Ceph> <Configuration>.

Selecting <Ceph> <Monitor> shows the Monitor configuration.

At this point the GUI can be used to create the Ceph OSDs and pools. Note that the ceph cluster is showing an error status because no OSDs have been created yet.

===============================================================

Note: if timeouts are observed at the OSD screen, perform the following tasks.

Edit the repository list:

# nano /etc/apt/sources.list.d/pve-enterprise.list

In the file, comment out the enterprise repository and add the no-subscription repository:

#deb https://enterprise.proxmox.com/debian/pve stretch pve-enterprise
deb http://download.proxmox.com/debian/pve stretch pve-no-subscription

Then update the system:

# apt update && apt dist-upgrade

Next, create a Ceph Manager (mgr) on each Monitor host:

# pveceph createmgr

Credit for this workaround should be given to the author at:

https://forum.proxmox.com/threads/ceph-osd-on-pve5-got-timeout-500-in-gui.36235/

=================================================================

 

Create an OSD by selecting <Ceph>-<OSD>-<Create OSD>

Eligible disks are shown in the drop-down box; in addition, there is a choice to use a journal disk or to co-locate the journal on the OSD data disk. Note that the OSD disks should be cleared before this stage using a tool such as parted, because ceph may be conservative in its approach to creating OSDs if it finds existing data on the candidate device.
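
As a sketch, assuming the USB disk appears as /dev/sdb (device names will vary), the disk can be wiped from a shell, and the OSD could equally be created there with pveceph instead of the GUI:

# parted -s /dev/sdb mklabel gpt
# pveceph createosd /dev/sdb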

Select <Create> and the system will begin to create the OSD.

The OSD screen now shows the newly formed OSD. Note that the weight corresponds to the capacity; ceph uses this value to balance capacity across the cluster. Since this is the first OSD it has been given an index of 0.

At this point the ceph cluster is still degraded. Continue adding OSDs until there is at least one OSD configured on each server node. After adding one OSD to each server the <Ceph> <OSD> screen looks like:

The main Ceph screen shows a healthy cluster now that the replication requirements have been met:

At a console prompt, issue the command ceph osd tree to see a similar view.

root@proxmox127:/etc/ceph# ceph osd tree
ID WEIGHT  TYPE NAME            UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 8.18669 root default
-2 2.72890     host proxmox127
 0 2.72890         osd.0             up  1.00000          1.00000
-3 2.72890     host proxmox128
 1 2.72890         osd.1             up  1.00000          1.00000
-4 2.72890     host proxmox129
 2 2.72890         osd.2             up  1.00000          1.00000

The next task is to create an Object Storage Pool. Select <Ceph> <Pools> <Create> and enter the appropriate parameters. Here the replication size is left at 3 since this is a temporary pool that will be deleted shortly.

After creation, the new pool will be displayed:

Note these commands can also be performed from the command line.
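
For instance, the temporary pool could be created from a console with the standard ceph tooling (the placement group count of 64 here is only illustrative):

# ceph osd pool create objectpool 64 64
# ceph osd pool set objectpool size 3
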
Next perform a short benchmark test to ensure basic functionality:
# rados bench -p objectpool 20 write
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 20 seconds or 0 objects
Object prefix: benchmark_data_proxmox127_12968
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 16 0 0 0 - 0
2 16 34 18 35.9956 36 0.519035 1.22814
3 16 53 37 49.3273 76 0.93994 1.10036
4 16 69 53 52.9938 64 0.91114 1.04655
5 16 86 70 55.9936 68 0.902308 1.02032
6 16 106 90 59.9931 80 0.622469 0.997476
7 16 121 105 59.9931 60 1.14761 0.9906
8 16 140 124 61.9929 76 0.725148 0.980396
9 16 157 141 62.6595 68 1.01548 0.975507
10 16 173 157 62.7926 64 0.781161 0.970356
11 16 193 177 64.3561 80 0.897278 0.959163
12 16 211 195 64.9923 72 0.993592 0.949374
13 16 230 214 65.8383 76 0.963193 0.944466
14 16 246 230 65.7065 64 0.813863 0.9372
15 16 264 248 66.1255 72 0.851179 0.934594
16 16 284 268 66.9921 80 0.868938 0.931225
17 16 299 283 66.5804 60 0.986844 0.932238
18 16 317 301 66.881 72 0.885554 0.931278
19 16 335 319 67.1501 72 0.873851 0.932255
2017-07-26 16:28:35.531937 min lat: 0.519035 max lat: 1.72213 avg lat: 0.926949
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
20 16 353 337 67.3921 72 0.839821 0.926949
Total time run: 20.454429
Total writes made: 354
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 69.2271
Stddev Bandwidth: 18.3716
Max bandwidth (MB/sec): 80
Min bandwidth (MB/sec): 0
Average IOPS: 17
Stddev IOPS: 4
Max IOPS: 20
Min IOPS: 0
Average Latency(s): 0.923226
Stddev Latency(s): 0.177937
Max latency(s): 1.72213
Min latency(s): 0.441373
Cleaning up (deleting benchmark objects)
Removed 354 objects
Clean up completed and total clean up time :1.034961

Delete the pool by highlighting it and selecting <Remove> and then follow the prompts.


Using Ceph Storage as VM space

In this example two pools will be used – one for storing images and the other for containers. Create the pools with a replication size of 2 and set the pg count at 128. This will allow the addition of further OSDs later on. The pools will be named rbd-vms and rbd-containers.
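
If preferred, the same pools could be created from a console with standard ceph commands (a sketch of the settings described above):

# ceph osd pool create rbd-vms 128 128
# ceph osd pool set rbd-vms size 2
# ceph osd pool create rbd-containers 128 128
# ceph osd pool set rbd-containers size 2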

Copy the keyring to the locations shown below. Note that the filename is /etc/pve/priv/ceph/<poolname>.keyring.
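
A minimal sketch of the copy, assuming the admin keyring created by pveceph is reused and the storage entries added in the next step are named after the pools:

# mkdir -p /etc/pve/priv/ceph
# cp /etc/ceph/ceph.client.admin.keyring /etc/pve/priv/ceph/rbd-vms.keyring
# cp /etc/ceph/ceph.client.admin.keyring /etc/pve/priv/ceph/rbd-containers.keyring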

Next from the GUI select <Datacenter> <Storage> <Add> and select <RBD> as the storage type.

At the dialogue enter the parameters as shown, adding all three of the monitors.

For the container pool select <KRBD> and use <Container> for the content, deselecting <Image>.
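
The resulting entries in /etc/pve/storage.cfg would then look roughly like the following (monitor addresses from this example; the exact fields the GUI writes may differ slightly):

rbd: rbd-vms
        monhost 192.168.1.127 192.168.1.128 192.168.1.129
        pool rbd-vms
        content images
        username admin

rbd: rbd-containers
        monhost 192.168.1.127 192.168.1.128 192.168.1.129
        pool rbd-containers
        content rootdir
        username admin
        krbd 1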

Now the Server View pane shows the storage available to all nodes.

Select one of the nodes, then <rbd-vms> <Summary>.

Using the ceph pools for VM storage

First upload an image to one of the servers – in this example the VM will be created on node proxmox127. Select the local storage and upload an iso image to it.

In this example Ubuntu 17 Server has been selected.

Now select <Create VM> from the top right-hand side of the screen and respond to the prompts until the <Hard Disk> screen is reached. From the drop-down menu select <rbd-vms> as the storage for the virtual Ubuntu machine.

Complete the prompts, and then select the VM’s <Hardware> tab and add a second 40 GB virtual disk to the VM, again using rbd-vms.
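
The same could be done from a shell with qm; a sketch, assuming the new VM was given ID 100:

# qm set 100 --virtio1 rbd-vms:40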

Start the VM and install Ubuntu Server.

Note during the installation that both of the ceph-based virtual disks are presented.

The ceph performance screen shows the I/O activity.

Summary

Proxmox is highly recommended, and not just for home use: it is an excellent hypervisor that is used in many production environments. There are a number of support options available, and in the author’s opinion it is the easiest and fastest way to get started with exploring the world of Ceph.

Comments and suggestions for future articles welcome!
