Routinator’s monitoring service provides comprehensive
metrics in Prometheus format at the
/metrics endpoint. Here you can
find an overview of all metrics and their meaning.
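As a rough illustration of how this output can be consumed, the sketch below parses text in the Prometheus exposition format into name, labels and value. The metric names and values in `SAMPLE` are invented for the example and are not taken from Routinator, and the parser is deliberately minimal: it ignores optional timestamps and does not unescape label values.

```python
import re

# Sample text in Prometheus exposition format. The metric names and values
# are hypothetical and only serve to exercise the parser.
SAMPLE = """\
# HELP example_last_update_duration Duration of the last update
# TYPE example_last_update_duration gauge
example_last_update_duration 42
example_objects{name="arin",type="roa",state="valid"} 1234
"""

# metric_name, optional {labels}, whitespace, value
LINE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
# key="value" pairs inside the braces
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simple parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

In practice the text would come from an HTTP GET on the /metrics endpoint of wherever the HTTP server is listening, and a maintained Prometheus client library is likely a better choice than a hand-written parser.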
Seconds since the start of the last update.
Duration of the last update in seconds.
Seconds since the end of the last update.
The current serial number for data served to RTR clients.
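The value for seconds since the end of the last update lends itself to a simple freshness check. The following is a minimal sketch; the function name and the one-hour default threshold are assumptions made for illustration and should be adapted to the configured refresh interval.

```python
def update_is_overdue(seconds_since_done: float,
                      max_age_seconds: float = 3600.0) -> bool:
    """Return True if the last update finished longer ago than allowed.

    max_age_seconds is a hypothetical threshold, not a Routinator default.
    """
    return seconds_since_done > max_age_seconds
```

For example, `update_is_overdue(7200.0)` flags an instance whose last update ended two hours ago.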
Publication metrics are provided for all trust anchors and for each RPKI repository.
All metrics for trust anchors have a label
name, named after the Trust
Anchor Locator file name without the .tal extension, e.g. arin. All metrics
for repositories have a label
uri specifying the URI of the notification
file of the RRDP repository, or the base URI of the rsync repository.
The number of publication points per trust anchor. In most cases these will be the five Regional Internet Registries, but will include the trust anchors of any configured testbeds as well.
This metric has two labels: either
name for a trust anchor or uri for a repository, followed by
state, which is either valid or rejected.
Metrics for each configured trust anchor. In most cases these will be the five Regional Internet Registries, but will include the trust anchors of any configured testbeds as well.
This metric has three labels: either
name for a trust anchor or uri for a repository, followed by
type for the type of object, e.g. crl, and lastly
state describing its validity state, such as valid or stale.
The types and states of objects can be:
manifest - The number of manifests for each of the states valid, invalid, stale and missing. A manifest is invalid if it is not correctly encoded, has expired or is not correctly signed by the issuing CA. It is considered stale if the current time is past the time an update to the manifest should have been issued. Whether a stale manifest is valid or invalid depends on configuration. By default a stale manifest is considered invalid.
crl - The number of certificate revocation lists for each of the states valid, invalid, stale and stray. A CRL is invalid if it is not correctly encoded or is not correctly signed by the issuing CA. It is considered stale if the current time is past the time an update to the CRL should have been issued. Whether a stale CRL is valid or invalid depends on configuration. By default a stale CRL is considered invalid. Lastly, each CA should only issue one CRL. This CRL should both be listed on the manifest and used by the manifest’s certificate itself. Any CRL listed on the manifest that is not also the manifest’s own CRL is considered a stray.
ca_cert - The number of Certificate Authority (CA) certificates with the state valid.
router_cert - The number of End Entity (EE) certificates found to be present and valid. This only refers to such certificates included as stand-alone files, which are BGPsec router certificates.
roa - The number of Route Origin Attestations for each of the states valid and invalid.
gbr - The number of Ghostbusters Records for each of the states valid and invalid. Note that currently the content of a Ghostbusters Record is not checked.
other - The number of objects found that are not certificates (.cer), Certificate Revocation Lists (.crl), manifests (.mft), ROAs (.roa), or Ghostbusters Records (.gbr) and have the state invalid.
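Because every sample of this metric carries a type and a state label, totals across all trust anchors or repositories can be obtained by grouping on those two labels. The sketch below assumes the samples have already been parsed into pairs of a label dictionary and a value; the sample data is made up for illustration.

```python
from collections import defaultdict

# Hypothetical parsed samples: (labels, value). The counts are invented.
SAMPLES = [
    ({"name": "arin", "type": "roa", "state": "valid"}, 100),
    ({"name": "arin", "type": "roa", "state": "invalid"}, 2),
    ({"name": "ripe", "type": "roa", "state": "valid"}, 300),
    ({"name": "ripe", "type": "crl", "state": "stale"}, 1),
]


def totals_by_type_and_state(samples):
    """Sum object counts over all sources, keyed by (type, state)."""
    totals = defaultdict(int)
    for labels, value in samples:
        totals[(labels["type"], labels["state"])] += value
    return dict(totals)
```

Grouping on the name or uri label instead would give per-source totals across all object types.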
The following metrics all have just one label, either
name in case of a
trust anchor or
uri for repositories:
The number of VRPs found to be present and valid.
The number of duplicate VRPs resulting from ROAs containing the same authorisation.
Note that if a VRP appears in multiple trust anchors or repositories, which occurrence is considered the duplicate depends on the order of processing, which may change between validation runs. As a result, this number may change unexpectedly.
The number of VRPs that are contributed by this trust anchor or repository to the final set provided to your routers. This is the total number of VRPs, minus the ones that are locally filtered, duplicate, and, if configured to be dropped, unsafe.
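The relationship described above can be written out as simple arithmetic. This is a sketch under the assumption that the locally filtered, duplicate and unsafe counts do not overlap; the function and parameter names are chosen for this example only.

```python
def contributed_vrps(total: int, locally_filtered: int, duplicate: int,
                     unsafe: int, drop_unsafe: bool) -> int:
    """Compute the VRPs contributed to the final set.

    Unsafe VRPs are only subtracted when they are configured to be dropped.
    Assumes the three subtracted categories are disjoint.
    """
    dropped_unsafe = unsafe if drop_unsafe else 0
    return total - locally_filtered - duplicate - dropped_unsafe
```

With the made-up inputs `contributed_vrps(1000, 10, 20, 5, drop_unsafe=True)` the result is 965; with `drop_unsafe=False` it is 970.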
Rsync Update Metrics
For each repository updated via rsync the following values are given.
The status code returned by the rsync process. A value of 0 means the process has finished successfully. The meaning of other values depends on the rsync client used. Please refer to its documentation for further details.
The duration the rsync process was running in seconds.
RRDP Update Metrics
For each repository updated via RRDP the following values are given. All metrics
have a label
uri specifying the URI of the notification file of the RRDP repository.
The overall status of the update. This will be 200 if the update succeeded, 304 if no update was necessary because the data was already current, and any other value for a failed update. If the value is -1, it was not possible to reach the HTTPS server at all.
The status of retrieving the notification file. This is the first step of an RRDP update. A value of 200 indicates that the file was successfully retrieved. A value of 304 indicates that the file hasn’t changed since last update and no actual update is necessary. Any other value represents an error.
The status of retrieving the actual payload. This is the second step of an RRDP update and may either represent a single HTTPS request for the snapshot file or a series of HTTPS requests for the sequence of delta files necessary to update from the last known state.
A value of 0 means that no payload retrieval was necessary. A value of 200 means that the update was successful. Any other value indicates an error. In case of a sequence of delta updates, this error may have been preceded by one or more successful requests.
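The status values described above can be mapped to short descriptions, for instance when building a dashboard or an alert message. The following sketch only encodes the values named in this section; the function names and the returned strings are assumptions made for this example.

```python
def describe_overall_status(status: int) -> str:
    """Interpret the overall RRDP update status."""
    if status == 200:
        return "updated"
    if status == 304:
        return "already current"
    if status == -1:
        return "server unreachable"
    return "failed"  # any other value is a failed update


def describe_payload_status(status: int) -> str:
    """Interpret the payload retrieval status."""
    if status == 0:
        return "not needed"
    if status == 200:
        return "retrieved"
    return "error"  # may follow earlier successful delta requests
```

Keeping the two functions separate reflects that 0 is only meaningful for the payload step, while -1 and 304 are described for the overall status.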
The overall duration of the RRDP update in seconds.
The serial number stated by the RRDP server for the current data set. With each update the serial number is increased by one.
RTR Server Metrics
A number of metrics are provided describing the state of the included RTR server. These metrics are available whether the RTR server is actually enabled or not.
The number of currently open RTR connections.
The total number of bytes read from RTR connections. In other words, this describes how much data has been sent by clients.
The total number of bytes written to RTR connections. In other words, this describes how much data has been sent to clients.
The number of seconds since the last cache reset by a client address.
The number of reset queries by a client address.
The number of serial queries by a client address.
New in version 0.12.0:
HTTP Server Metrics
A number of metrics are provided describing the state of the included HTTP server.
The total number of connections made with the HTTP server.
The number of currently open connections. This should at least be 1 as there is a connection open when requesting the JSON metrics.
The total number of requests received and answered by the HTTP server.
The number of bytes read from and written to HTTP clients.