# Observability For Quality

Observability is crucial in **enabling fast development and innovation at Infobip's scale**. Without accessible, fast, and accurate insight into the state of each component/service, it's impossible to ensure the quality of service that our customers expect.

We rely on the tools most commonly found in observability stacks across the industry. Because of [Infobip's global presence and scale](/handbook/start-here/infobip-at-a-glance.md), we also constantly experiment with new and better solutions to support our growth.

Observability at Infobip includes:

* **Metrics monitoring** (VictoriaMetrics clusters, multiple Prometheus instances with a total of 300M active series);
* **Events monitoring** (Grafana Faro with 85M daily events);
* **Logging** (Azure Data Explorer with 1PB of logs);
* **Communication logs** (OpenSearch clusters with 100B documents).

The most common metrics cover service level indicators (SLIs) such as **traffic, error rate, latency, and saturation**.
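As a rough illustration of how these four indicators relate to raw request data, here is a minimal Python sketch. The `Request` record, the `compute_slis` function, and the `capacity_rps` parameter are hypothetical names chosen for this example; in practice these values are derived from the metrics backends listed above, not computed in application code like this.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    duration_ms: float
    ok: bool


def compute_slis(requests, window_seconds, capacity_rps):
    """Compute traffic, error rate, p95 latency, and saturation for one window.

    capacity_rps is an assumed, known maximum throughput for the service.
    """
    total = len(requests)
    if total == 0:
        return {"traffic_rps": 0.0, "error_rate": 0.0,
                "latency_p95_ms": 0.0, "saturation": 0.0}

    traffic_rps = total / window_seconds
    error_rate = sum(1 for r in requests if not r.ok) / total

    # Nearest-rank 95th percentile, using integer arithmetic for the rank.
    durations = sorted(r.duration_ms for r in requests)
    rank = max(1, (95 * total + 99) // 100)
    latency_p95_ms = durations[rank - 1]

    # Saturation here is simply observed traffic relative to assumed capacity.
    saturation = traffic_rps / capacity_rps

    return {"traffic_rps": traffic_rps, "error_rate": error_rate,
            "latency_p95_ms": latency_p95_ms, "saturation": saturation}
```

Saturation is modelled here as throughput over capacity purely for simplicity; real services more often track resource-level signals such as CPU, memory, or queue depth.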

Based on this data, every component/service has alerts that help it meet the SLAs defined by the team that owns it. Alerts are primarily created in Alertmanager/Prometheus and sent to OpsGenie. In OpsGenie, each team defines on-call schedules, notification policies, etc., according to their way of working.

Visualizations are another valuable tool for engineers to understand their systems' state better. The most used visualization tool in Infobip is **Grafana, with many data sources and over 4250 dashboards**.

Given our specific business use cases at such a scale, it was challenging to find a commercial "silver bullet", so in addition to the tools mentioned above, we created our own alerting tool to help us cover blind spots:

* Based on InfluxDB and Elasticsearch;
* Integrated with OpsGenie;
* Ability to create alerts based on fixed or dynamic thresholds and on anomaly detection;
* Predefined diagnostic pages for each alert type.
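
To make the difference between a fixed and a dynamic threshold concrete, here is a minimal Python sketch. It is not the implementation of the in-house tool: the function names and the "mean plus `k` standard deviations" rule are assumptions chosen for illustration, and real anomaly detection is usually more involved.

```python
from statistics import mean, stdev


def breaches_fixed(value, threshold):
    """Fixed threshold: alert when the value exceeds a constant limit."""
    return value > threshold


def breaches_dynamic(history, value, k=3.0):
    """Dynamic threshold: alert when the value exceeds the recent baseline.

    The baseline is assumed to be mean(history) + k * stdev(history).
    With fewer than two samples there is no baseline, so nothing fires.
    """
    if len(history) < 2:
        return False
    limit = mean(history) + k * stdev(history)
    return value > limit
```

For example, with a recent history of `[100, 102, 98, 101, 99]`, a value of `110` breaches the dynamic threshold but not a fixed threshold of `120`, which is the kind of blind spot a fixed limit alone can leave.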

