diff --git a/docs/20_Checks/_index.md b/docs/20_Checks/_index.md
index 0b318f19f2ead0664bfaaa1428dff82844205fc6..b54fa7893243776184d8f099b8cacb4ee67a4430 100644
--- a/docs/20_Checks/_index.md
+++ b/docs/20_Checks/_index.md
@@ -11,7 +11,7 @@ There is one include script used by all checks:
 * check_backup_one
 * [check_ceph_diskfree](check_ceph_diskfree.md)
 * [check_ceph_io](check_ceph_io.md)
-* check_ceph_osd
+* [check_ceph_osd](check_ceph_osd.md)
 * check_ceph_status
 * check_clientbackup
 * check_couchdb-lb
diff --git a/docs/20_Checks/check_ceph_osd.md b/docs/20_Checks/check_ceph_osd.md
new file mode 100644
index 0000000000000000000000000000000000000000..84868caeab00260815f2e98dc7567494dca00efc
--- /dev/null
+++ b/docs/20_Checks/check_ceph_osd.md
@@ -0,0 +1,95 @@
+# Check Ceph OSDs
+
+## Introduction
+
+Shows the Ceph OSD status: how many OSDs exist and how many are up or down.
+This check sends performance data.
+
+Depending on your cluster, you might want to raise the warning and
+critical thresholds.
+
+## Requirements
+
+* `ceph` binary and sudo permission to run it to gather the information
+
+## Syntax
+
+```txt
+______________________________________________________________________
+
+CHECK_CEPH_OSD
+v1.5
+
+(c) Institute for Medical Education - University of Bern
+Licence: GNU GPL 3
+______________________________________________________________________
+
+Show cheph osd status: how many OSDs exist and how many are up/ down.
+This check sends performance data.
+
+On your cluster you might want to increase the values for warning and
+critical level.
+
+SYNTAX:
+check_ceph_osd [-w WARN_LIMIT] [-c CRITICAL_LIMIT]
+
+OPTIONS:
+    -h or --help  show this help.
+    -w VALUE      warning level (default: 1)
+    -c VALUE      critical level (default: 2)
+
+EXAMPLE:
+check_ceph_osd
+    no parameters; normal usage to get the ceph osd status
+
+check_ceph_osd -c 10
+    change to critical level if 10 osds are down.
+
+```
+
+## Examples
+
+`$ check_ceph_osd` returns
+
+```txt
+OK: Check of available OSDs - 30 OSDs total .. 30 up .. 0 down (Limits: warn at 1; critical 2)
+ID  CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
+ -1       53.67825 root default
+ -3        9.31496     host ceph01
+  0   ssd  1.86299         osd.0      up  1.00000 1.00000
+  6   ssd  1.86299         osd.6      up  1.00000 1.00000
+ 12   ssd  1.86299         osd.12     up  1.00000 1.00000
+ 18   ssd  1.86299         osd.18     up  1.00000 1.00000
+ 24   ssd  1.86299         osd.24     up  1.00000 1.00000
+ -5        8.73299     host ceph02
+  1   ssd  1.74660         osd.1      up  1.00000 1.00000
+  7   ssd  1.74660         osd.7      up  1.00000 1.00000
+ 13   ssd  1.74660         osd.13     up  1.00000 1.00000
+ 19   ssd  1.74660         osd.19     up  1.00000 1.00000
+ 25   ssd  1.74660         osd.25     up  1.00000 1.00000
+ -7        8.73299     host ceph03
+  2   ssd  1.74660         osd.2      up  1.00000 1.00000
+  8   ssd  1.74660         osd.8      up  1.00000 1.00000
+ 14   ssd  1.74660         osd.14     up  1.00000 1.00000
+ 20   ssd  1.74660         osd.20     up  1.00000 1.00000
+ 26   ssd  1.74660         osd.26     up  1.00000 1.00000
+ -9        8.73299     host ceph04
+  3   ssd  1.74660         osd.3      up  1.00000 1.00000
+  9   ssd  1.74660         osd.9      up  1.00000 1.00000
+ 15   ssd  1.74660         osd.15     up  1.00000 1.00000
+ 21   ssd  1.74660         osd.21     up  1.00000 1.00000
+ 27   ssd  1.74660         osd.27     up  1.00000 1.00000
+-11        9.31496     host ceph05
+  5   ssd  1.86299         osd.5      up  1.00000 1.00000
+ 11   ssd  1.86299         osd.11     up  1.00000 1.00000
+ 17   ssd  1.86299         osd.17     up  1.00000 1.00000
+ 23   ssd  1.86299         osd.23     up  1.00000 1.00000
+ 29   ssd  1.86299         osd.29     up  1.00000 1.00000
+-13        8.84938     host ceph06
+  4   ssd  1.86299         osd.4      up  1.00000 1.00000
+ 10   ssd  1.74660         osd.10     up  1.00000 1.00000
+ 16   ssd  1.74660         osd.16     up  1.00000 1.00000
+ 22   ssd  1.74660         osd.22     up  1.00000 1.00000
+ 28   ssd  1.74660         osd.28     up  1.00000 1.00000
+ |osd-total=30;;;0;30 osd-up=30;;;0;30 osd-down=0;;;0;30
...
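
Since the new page lists sudo permission on the `ceph` binary as a requirement, it may be worth showing how that permission is typically granted. A minimal sudoers sketch (assuming the check runs as user `nagios` and `ceph` lives at `/usr/bin/ceph`; both are assumptions, adjust for your setup) could look like:

```txt
# /etc/sudoers.d/check_ceph_osd  (hypothetical file name)
# Allow the monitoring user to run the ceph binary as root without a password.
nagios ALL=(root) NOPASSWD: /usr/bin/ceph
```

Restricting the rule to the single binary (rather than `ALL`) keeps the monitoring user's privileges as narrow as the check needs.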