Commit a677739b authored by Hahn Axel (hahn)

Merge branch 'update-docs' into 'master'

Update docs

See merge request !118
parents fe737e3a f27d95aa
......@@ -18,12 +18,13 @@
# 2023-06-08 v0.4 <axel.hahn@unibe.ch> get summary for cronflicts and problems
# 2023-06-09 v0.5 <axel.hahn@unibe.ch> deltaunit can be set as parameter
# 2023-06-13 v0.6 <axel.hahn@unibe.ch> no output on activity; update replication check
# 2023-06-16 v0.7 <axel.hahn@unibe.ch> update help text
# ======================================================================
. $(dirname $0)/inc_pluginfunctions
self_APPNAME=$( basename $0 | tr [:lower:] [:upper:] )
self_APPVERSION=0.6
self_APPVERSION=0.7
# --- other vars...
cfgfile=/etc/icingaclient/.psql.conf
......@@ -126,12 +127,12 @@ OPTIONS:
PARAMETERS:
-m method; valid methods are:
activity running processes and queries
conflicts Detected conflicts from pg_stat_database_conflicts
activity Count running processes and queries
conflicts Count of detected conflicts
dbrows Count of database row actions
diskblock Count of diskblocks physically read or coming from cache
problems Problems and troublemakers
replication Replication status (table output only)
problems Count of problems and troublemakers
replication Replication status and lag time
transactions Count of transactions over all databases
EXAMPLES:
......@@ -392,7 +393,7 @@ case "${sMode}" in
;;
*)
echo ERRROR: [${sMode}] is an INVALID mode
echo "ERRROR: [${sMode}] is an INVALID mode"
_usage
ph.abort
......
......@@ -32,6 +32,7 @@ There is one include script used by all checks:
* check_proc_mem
* check_proc_ressources
* check_proc_zombie
* [check_psqlserver](check_psqlserver.md)
* [check_reboot_required](check_reboot_required.md)
* check_sensuplugins
* check_smartstatus
......
......@@ -3,21 +3,22 @@
## Introduction
**check_psqlserver** is a plugin that executes different checks on a PostgreSQL server instance.
The kind of check is defined by a paameter `-m METHOD`.
The kind of check is defined by a parameter `-m METHOD`.
## Requirements
The icinga user needs to connect to the database server.
* psql (cli tool)
* The icinga user needs to connect to the database server (see Installation).
## Syntax
`$ check_psqlserver [-i|-u|-m METHOD]`
```txt
./check_psqlserver
./check_psqlserver -h
______________________________________________________________________
CHECK_PSQLSERVER :: v0.6
CHECK_PSQLSERVER :: v0.7
(c) Institute for Medical Education - University of Bern
Licence: GNU GPL 3
......@@ -33,12 +34,12 @@ OPTIONS:
PARAMETERS:
-m method; valid methods are:
activity running processes and queries
conflicts Detected conflicts from pg_stat_database_conflicts
activity Count running processes and queries
conflicts Count of detected conflicts
dbrows Count of database row actions
diskblock Count of diskblocks physically read or coming from cache
problems Problems and troublemakers
replication Replication status (table output only)
problems Count of problems and troublemakers
replication Replication status and lag time
transactions Count of transactions over all databases
EXAMPLES:
......@@ -72,6 +73,17 @@ export PGHOST=localhost
export PGDATABASE=postgres
```
To test the connection run `./check_psqlserver -m activity`.
If the config was written and the connection fails, look for pg_hba.conf (/var/lib/pgsql/data/pg_hba.conf or /etc/postgresql/13/main/pg_hba.conf).
If local authentication for IPv4 and IPv6 is set to "ident"
```txt
host all all 127.0.0.1/32 ident
```
... try to set it to "md5" and restart the pgsql service.
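To spot such entries quickly, a small filter can help. This is only a sketch: `ident_rules` is a hypothetical helper that is not part of the plugin, and it assumes the authentication method is the last field of a `host` line (no trailing auth options).

```shell
# Hypothetical helper, not part of check_psqlserver: print all "host"
# rules whose last field is "ident". Reads the given files, or stdin
# if no file is passed. Comment lines are skipped because their first
# field is not "host".
ident_rules() {
  awk '$1 == "host" && $NF == "ident"' "$@"
}

# Sample input instead of a real pg_hba.conf
printf '%s\n' \
  "# TYPE DATABASE USER ADDRESS METHOD" \
  "local all all peer" \
  "host all all 127.0.0.1/32 ident" \
  "host all all ::1/128 md5" \
  | ident_rules
```

On a real system the function would be called with the path of the pg_hba.conf found above, for example `ident_rules /var/lib/pgsql/data/pg_hba.conf`.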
## Checks
The checks are done on the server and summarize data from statistic tables for all databases.
......@@ -82,7 +94,29 @@ If you need to troubleshot and want to see which of your databases causes the tr
### activity
Show running processes and queries
Show the count of running processes and a sum per process state.
Possible states in pg_stat_activity are:
* active: The backend is executing a query.
* idle: The backend is waiting for a new client command.
* idle in transaction: The backend is in a transaction, but is not currently executing a query.
* idle in transaction (aborted): This state is similar to idle in transaction, except one of the statements in the transaction caused an error.
* fastpath function call: The backend is executing a fast-path function.
* disabled: This state is reported if track_activities is disabled in this backend.
The check summarizes:
* total - the total count of all processes
* active - processes with state "active"
* idle - processes with state "idle", "idle in transaction" and "idle in transaction (aborted)"
* fastpath - processes with state "fastpath function call"
* other - count of psql base processes having no value in state column
The state of the check is always "OK".
To find the troublemaker when the number of processes is high, run `select * from pg_stat_activity` to see the queries and the database name.
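The grouping of states described above can be sketched as follows. This is not the code of the plugin: `summarize_states` is a hypothetical helper, and it assumes one state per input line. Every state outside the three named groups, including an empty one, is counted as "other" here.

```shell
# Hypothetical helper, not part of check_psqlserver: read one state per
# line and group the values into total, active, idle, fastpath and other.
summarize_states() {
  awk '
    { total++ }
    $0 == "active"                 { active++;   next }
    $0 ~ /^idle/                   { idle++;     next }  # all three idle states
    $0 == "fastpath function call" { fastpath++; next }
    { other++ }                                          # empty or unknown state
    END {
      printf "total=%d active=%d idle=%d fastpath=%d other=%d\n",
             total, active, idle, fastpath, other
    }'
}

# Sample input instead of a live database query
printf '%s\n' "active" "idle" "idle in transaction" "" "fastpath function call" \
  | summarize_states
```

With a working connection the input could come from `psql -At -c 'select state from pg_stat_activity'`, which prints one unaligned value per row.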
Example output:
```txt
./check_psqlserver -m activity
......@@ -96,7 +130,21 @@ select * from pg_stat_activity.
### conflicts
Detected conflicts from pg_stat_database_conflicts
Show the number of detected conflicts from pg_stat_database_conflicts. The values are counters. Therefore the check calculates a delta per minute to find newly occurred changes.
The columns in pg_stat_database_conflicts are:
* confl_tablespace bigint - Number of queries in this database that have been canceled due to dropped tablespaces
* confl_lock bigint - Number of queries in this database that have been canceled due to lock timeouts
* confl_snapshot bigint - Number of queries in this database that have been canceled due to old snapshots
* confl_bufferpin bigint - Number of queries in this database that have been canceled due to pinned buffers
* confl_deadlock bigint - Number of queries in this database that have been canceled due to deadlocks
The check summarizes all conflicts of all databases.
The check switches to "critical" if one of the delta values per minute is not 0.
Example output:
```txt
./check_psqlserver -m conflicts
......@@ -113,10 +161,22 @@ select * from pg_stat_database_conflicts.
|confltablespace=0;; confllock=0;; conflsnapshot=0;; conflbufferpin=0;; confldeadlock=0;;
```
### dbrows
Count of database row actions
Count of database row actions.
From pg_stat_database we read the following columns and add them for all databases.
* tup_fetched bigint - Number of live rows fetched by index scans in this database
* tup_inserted bigint - Number of rows inserted by queries in this database
* tup_updated bigint - Number of rows updated by queries in this database
* tup_deleted bigint - Number of rows deleted by queries in this database
The values are counters. Therefore the check calculates a delta per second to find current changes.
The state of the check is always "OK".
Example output:
```txt
./check_psqlserver -m dbrows
......@@ -137,6 +197,17 @@ select * from pg_stat_database.
Count of diskblocks physically read or coming from cache
From pg_stat_database we read the following columns and add them for all databases.
* blks_read bigint - Number of disk blocks read in this database
* blks_hit bigint - Number of times disk blocks were found already in the buffer cache, so that a read was not necessary (this only includes hits in the PostgreSQL buffer cache, not the operating system's file system cache)
The values are counters. Therefore the check calculates a delta per second to find current changes.
The state of the check is always "OK".
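How a counter becomes a rate can be illustrated with a minimal sketch. `counter_rate` is a hypothetical function and is not taken from inc_pluginfunctions. It assumes that the previous value and the time between both readings are already known, and it treats a shrinking counter as a statistics reset.

```shell
# Hypothetical helper, not taken from inc_pluginfunctions: turn two
# readings of a growing counter into a rate.
#   $1 previous value   $2 current value
#   $3 seconds between the two readings
#   $4 unit in seconds: 1 = per sec (default), 60 = per min
counter_rate() {
  prev=$1; now=$2; elapsed=$3; unit=${4:-1}
  # Assumption: a smaller current value means the statistics were reset,
  # so no rate is reported for this interval.
  if [ "$now" -lt "$prev" ] || [ "$elapsed" -le 0 ]; then
    echo 0
    return
  fi
  # Integer arithmetic; the fractional part is truncated
  echo $(( (now - prev) * unit / elapsed ))
}

counter_rate 1000 1600 300      # 600 more blocks in 300 sec, per sec
counter_rate 10 15 300 60       # 5 more events in 300 sec, per min
```

The same calculation with a unit of 60 covers the checks that report a delta per minute.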
Example output:
```txt
./check_psqlserver -m diskblock
OK: Pgsql diskblock :: Count of diskblocks physically read or coming from cache (from pg_stat_database)
......@@ -153,6 +224,20 @@ select * from pg_stat_database.
Problems and troublemakers
From pg_stat_database we read the following columns and add them for all databases.
* conflicts bigint - Number of queries canceled due to conflicts with recovery in this database. (Conflicts occur only on standby servers; see pg_stat_database_conflicts for details.)
* deadlocks bigint - Number of deadlocks detected in this database
* checksum_failures bigint - Number of data page checksum failures detected in this database (or on a shared object), or NULL if data checksums are not enabled.
* temp_files bigint - Number of temporary files created by queries in this database. All temporary files are counted, regardless of why the temporary file was created (e.g., sorting or hashing), and regardless of the log_temp_files setting.
* temp_bytes bigint - Total amount of data written to temporary files by queries in this database. All temporary files are counted, regardless of why the temporary file was created, and regardless of the log_temp_files setting.
The values are counters. Therefore the check calculates a delta per minute to find current changes.
The state of the check switches to critical if at least one problem was detected in the delta values.
Example output:
```txt
./check_psqlserver -m problems
OK: Pgsql problems :: Problems and troublemakers (from pg_stat_database) ... OK, nothing was found
......@@ -173,10 +258,15 @@ select * from pg_stat_database.
Replication status.
It shows the defined replication and their status.
It switches to state warning if one of the replications is not "streaming".
Additionally it fetches the maximum lag of write, flush and replay of all replications.
The state switches to warning if it is larger 1 sec (just experimental).
The state of the check switches to "warning" if ...
* one of the replications is not "streaming"
* the maximum lag is larger than 1 sec (just experimental).
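The two warning conditions can be expressed in a few lines. This is a sketch only: `replication_state` is a hypothetical helper, and it assumes that each replication was already reduced to one line with its state and its largest lag in seconds.

```shell
# Hypothetical helper, not part of check_psqlserver: read one replication
# per line as "<state> <lag in sec>" and print the resulting check state.
#   $1 lag limit in seconds (default: 1)
replication_state() {
  awk -v maxlag="${1:-1}" '
    $1 != "streaming"       { bad = 1 }   # replication is not streaming
    ($2 + 0) > (maxlag + 0) { bad = 1 }   # lag is above the limit
    END { print (bad ? "WARNING" : "OK") }'
}

# Sample input instead of a query on pg_stat_replication
printf '%s\n' "streaming 0.02" "streaming 0.4" | replication_state
printf '%s\n' "streaming 0.02" "catchup 3.5"   | replication_state
```

The limit is a parameter because the value of 1 sec is described as experimental.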
Example output:
```txt
./check_psqlserver -m replication
......@@ -197,6 +287,17 @@ select * from pg_stat_replication.
Count of transactions over all databases
From pg_stat_database we fetch these columns and sum them up for all databases:
* xact_commit bigint - Number of transactions in this database that have been committed
* xact_rollback bigint - Number of transactions in this database that have been rolled back
The values are counters. Therefore the check calculates a delta per second to show the current speed.
The state of the check is always "OK".
Example output:
```txt
./check_psqlserver -m transactions
OK: Pgsql transactions :: Count of transactions over all databases
......@@ -208,3 +309,16 @@ select * from pg_stat_database.
|commit=0;; rollback=0;;
```
## Run a query on command line
As root or icingaclient user you can read the configuration for the database monitoring user (created with param `-i`).
In a terminal you can source the created config file. Then run a query using psql.
Example:
```txt
. /etc/icingaclient/.psql.conf
psql -c 'select * from pg_stat_activity'
```
\ No newline at end of file