Commit 91ed793

update enrichment tables docs (#113)
* update enrichment tables docs
* added storage limit to the enrichment tables doc
* update enrichment table docs
1 parent 2602b9b commit 91ed793

10 files changed

+250
-200
lines changed


docs/images/use-enrichment-table.png

534 KB

docs/user-guide/.pages

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ nav:
99
- Dashboards: dashboards
1010
- Actions: actions
1111
- Functions: functions
12+
- Enrichment Tables: enrichment-tables
1213
- Real User Monitoring (RUM): rum.md
1314
- Identity and Access Management (IAM): identity-and-access-management
1415
- Management: management
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
nav:
2+
3+
- Enrichment Tables Overview: index.md
4+
- Create and Use Enrichment Tables: enrichment.md
5+
- Upload, Caching, and Restart Behavior: enrichment-table-upload-recovery.md
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
1+
---
2+
title: Upload, Caching, and Restart Behavior – OpenObserve
3+
description: Explains enrichment table upload, caching, and recovery behavior in OpenObserve based on file size and system settings.
4+
---
5+
This page describes how OpenObserve handles [enrichment table](../enrichment/) uploads, caching, and table loading behavior during node restart.
6+
7+
8+
## Upload Behavior
9+
The upload flow depends on the file size. The size threshold is set by the environment variable `ZO_ENRICHMENT_TABLE_MERGE_THRESHOLD_MB`.
10+
> Default value: 60 MB
11+
12+
### When the file is smaller than 60 MB
13+
- The file is uploaded to the metadata and file list storage system, for example PostgreSQL.
14+
- A background job runs at regular intervals to:
15+
16+
- Merge all enrichment files received in the last interval.
17+
- Create a single Parquet file.
18+
- Upload the merged file to the remote telemetry storage such as S3.
19+
20+
!!! info "To configure the interval:"
21+
Set the `ZO_ENRICHMENT_TABLE_MERGE_INTERVAL` environment variable.
22+
23+
- This variable defines how frequently the merge job runs.
24+
- The value is in seconds.
25+
- Default: `600` (10 minutes)
26+
27+
### When the file is 60 MB or larger
28+
29+
- The file bypasses the metadata and file list storage system.
30+
- It is uploaded directly to remote telemetry storage, such as S3.
31+
- No merging or background sync is involved in this path.
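
The size-based routing described above can be sketched as a small Python model. This is not OpenObserve code: the function name `choose_upload_path` and the returned labels are hypothetical, and only the variable name `ZO_ENRICHMENT_TABLE_MERGE_THRESHOLD_MB` and its 60 MB default come from this page.

```python
import os

DEFAULT_THRESHOLD_MB = 60  # default stated on this page


def choose_upload_path(file_size_mb, env=os.environ):
    """Return an illustrative label for where an uploaded file goes first."""
    threshold_mb = float(
        env.get("ZO_ENRICHMENT_TABLE_MERGE_THRESHOLD_MB", DEFAULT_THRESHOLD_MB)
    )
    if file_size_mb < threshold_mb:
        # Below the threshold: stored in the metadata and file list store,
        # then merged into a single Parquet file by the background job.
        return "metadata_store_then_merge"
    # At or above the threshold: written straight to remote storage,
    # with no merge job involved.
    return "remote_storage_direct"
```

Passing a plain dictionary as `env` makes it easy to try different threshold values without changing the real environment.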
32+
33+
34+
35+
## Local Disk Cache
36+
After every enrichment table upload, OpenObserve caches the data locally to allow quick recovery and reduce remote fetches.
37+
38+
- Default path: `/data/openobserve/cache/enrichment_table_cache`
39+
- Configurable with: `ZO_ENRICHMENT_TABLE_CACHE_DIR`
40+
41+
This cache is the primary recovery source during node restarts.
42+
43+
44+
## Behavior on Node Restart
45+
When a node restarts, OpenObserve restores the enrichment table in the following order:
46+
47+
### If local disk cache is available
48+
49+
OpenObserve first checks whether a local disk cache is available. If found, it then verifies whether the cached enrichment table is up to date.
50+
51+
- If the cache is current, OpenObserve loads the enrichment table directly from the local disk into memory.
52+
- If the cache is outdated, OpenObserve proceeds with the same flow used when no local disk cache is available.
53+
54+
### If local disk cache is missing
55+
56+
When no local disk cache is available:
57+
58+
- OpenObserve sends a single search request to one of the querier nodes.
59+
- The querier fetches the latest enrichment data from the metadata database, such as PostgreSQL, and the remote storage system, such as S3. It then provides the data to the restarting node.
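
The restart decision described in this section can be summarized as a short Python model. The function name `restore_source` and the returned labels are hypothetical and only illustrate the order of checks; how OpenObserve determines that a cache is current is not covered on this page.

```python
def restore_source(cache_present, cache_current):
    """Return an illustrative label for where a restarting node loads the table from."""
    if cache_present and cache_current:
        # A current local disk cache is loaded directly into memory.
        return "local_disk_cache"
    # A missing or outdated cache follows the same path: a single search
    # request to a querier, which reads the metadata database and remote storage.
    return "querier_search"
```

Note that an outdated cache and a missing cache lead to the same outcome, which matches the two bullet lists above.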
60+
61+
62+
Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
1+
---
2+
title: Enrichment Table – OpenObserve
3+
description: Learn how to enrich incoming or queried log data in OpenObserve using enrichment tables.
4+
---
5+
This page explains how to enrich incoming or queried log data in OpenObserve using enrichment tables.
6+
7+
## What Is an Enrichment Table
8+
An enrichment table in OpenObserve is a reference table used to enhance your log data with additional context. It is typically a CSV file that maps values of a field in your logs to descriptive values.
9+
10+
You can use enrichment tables during:
11+
12+
- **Ingestion**: To add context as data is ingested.
13+
- **Query time**: To enrich data dynamically while querying.
14+
15+
**Enrichment is performed using [Vector Remap Language (VRL) functions](https://vector.dev/docs/reference/vrl/).**
16+
17+
!!! note "Where to find"
18+
To access the enrichment table interface:
19+
20+
1. Select the appropriate organization from the dropdown in the top-right corner.
21+
2. Navigate to the left-hand menu.
22+
3. Select **Pipelines > Enrichment Tables**.
23+
24+
This opens the enrichment table management interface, where you can view, create, and manage enrichment tables available to the selected organization.
25+
26+
!!! note "Who can access"
27+
Access to enrichment tables is controlled via the **Enrichment Tables** module in the **IAM** settings, using **[role-based access control (RBAC)](../../identity-and-access-management/role-based-access-control/)**.
28+
29+
- **Root users** have full access by default.
30+
- Other users must be assigned access through **Roles** in **IAM**.
31+
- You can assign access to the entire **Enrichment Tables** module.
32+
- You can also assign permissions to individual enrichment tables. This allows fine-grained control over who can use or modify specific enrichment tables.
33+
34+
## Common Use Cases for Enrichment
35+
36+
Enrichment tables are often used to add human-readable context or derived values to logs. Examples include:
37+
38+
- **Country code to country name**: Add a new field that maps `IN` to `India`, `US` to `United States`, etc.
39+
40+
- **Status code to status label**: Add a new field that maps status `1` to `success`, `2` to `failure`, and `3` to `unknown`.
41+
42+
- **Internal vs external IP**: Add a new field that classifies the IP address as `internal` or `external` based on private IP ranges.
43+
44+
- **Protocol number to protocol name**: Add a new field that maps `6` to `TCP` and `17` to `UDP` using a protocol lookup table.
45+
46+
47+
48+
## How to Create and Use an Enrichment Table
49+
50+
### Step 1: Identify the Field to Enrich
51+
Review your log data and identify a field that contains codes or labels with limited context. <br>
52+
**Example** <br>
53+
The `log_iostream` field in the logs has values such as:
54+
55+
```json
"log_iostream": "stdout"
"log_iostream": "stderr"
```
59+
> The goal is to create a new field, for example `stream_type_description`, that provides a readable explanation like:
60+
```json
"log_iostream": "stderr"
"stream_type_description": "Standard Error – error or diagnostic logs"
```
64+
65+
### Step 2: Prepare the Enrichment Table
66+
Create a CSV file containing the original values and their corresponding descriptive meanings. Use clear and consistent column headers.
67+
Example CSV (`enrichment_reference.csv`)
68+
```csv
log_iostream,stream_type_description
stdout,Standard Output – application logs
stderr,Standard Error – error or diagnostic logs
```
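
If you generate the reference file from a script rather than by hand, a sketch along the following lines may help. It uses Python's standard `csv` module; the helper name `build_enrichment_csv` and the description strings are illustrative assumptions, not part of OpenObserve.

```python
import csv
import io

# Illustrative rows: one per distinct value of the field being enriched.
ROWS = [
    {"log_iostream": "stdout", "stream_type_description": "Standard Output – application logs"},
    {"log_iostream": "stderr", "stream_type_description": "Standard Error – error or diagnostic logs"},
]


def build_enrichment_csv(rows):
    """Serialize rows to CSV text with a header row taken from the first row's keys."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Write the text to the file that will be uploaded in the next step.
    with open("enrichment_reference.csv", "w", encoding="utf-8", newline="") as handle:
        handle.write(build_enrichment_csv(ROWS))
```

Using `csv.DictWriter` keeps the header and the column order consistent, and it quotes any value that happens to contain a comma.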
73+
74+
### Step 3: Upload the Enrichment Table
75+
76+
1. Go to **Pipelines > Enrichment Tables** in the OpenObserve UI.
77+
2. Click **Add Enrichment Table**.
78+
3. Set a name such as `log_stream_labels`.
79+
4. Upload your CSV file.
80+
5. Click **Save**.
81+
<br>
82+
![Upload the Enrichment Table](../../images/upload-enrichment-table.png)
83+
84+
The enrichment table is now available for use in VRL.
85+
86+
### Step 4: Use the Enrichment Table in a VRL Function
87+
1. Go to the **Logs** page.
88+
2. Select the relevant log stream.
89+
3. In the **VRL Function Editor**, enter the following:
90+
91+
```js linenums="1"
record, err = get_enrichment_table_record("log_stream_labels", {"log_iostream": .log_iostream})
.stream_type_description = record.stream_type_description
.
```
96+
97+
!!! note "Explanation:"
98+
**Line 1:** <br>
99+
100+
`record, err = get_enrichment_table_record("log_stream_labels", { "log_iostream": .log_iostream })`:
101+
102+
- This line searches the enrichment table named `log_stream_labels`.
103+
- It matches the field `log_iostream` in your log event with the `log_iostream` column in the enrichment table.
104+
- If a match is found, the corresponding row from the table is returned as `record`.
105+
- If no match is found or an error occurs, `record` will be empty and `err` will contain the error.
106+
107+
**Line 2:** <br>
108+
`.stream_type_description = record.stream_type_description`:
109+
110+
- This creates a new field called `stream_type_description` in your log event.
111+
- The value is taken from the `stream_type_description` column in the enrichment table row returned above.
112+
- If the enrichment table did not contain a matching entry, this field may not be added.
113+
114+
**Line 3:** <br>
115+
`.`
116+
117+
- This tells OpenObserve to return the modified log event, including the newly added field.
118+
119+
**Optional** <br>
120+
If you prefer to replace the original value instead of adding a new field, you can do:
121+
122+
```js linenums="1"
record, err = get_enrichment_table_record("log_stream_labels", {"log_iostream": .log_iostream})
.log_iostream = record.stream_type_description
.
```
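
To reason about the lookup in Step 4 outside of OpenObserve, the following Python model may be useful. It is only a sketch: `get_record` and `enrich` are hypothetical helpers, and the rule that a lookup succeeds only when exactly one row matches is an assumption of this model rather than something stated on this page.

```python
def get_record(table, condition):
    """Return (row, error) for the row whose columns equal every value in condition."""
    matches = [
        row for row in table
        if all(row.get(key) == value for key, value in condition.items())
    ]
    if len(matches) != 1:
        # Assumption of this model: zero or multiple matches count as an error.
        return {}, "expected 1 matching row, found {}".format(len(matches))
    return matches[0], None


def enrich(event, table):
    """Return a copy of event with stream_type_description added when a row matches."""
    record, err = get_record(table, {"log_iostream": event.get("log_iostream")})
    enriched = dict(event)
    if err is None:
        enriched["stream_type_description"] = record["stream_type_description"]
    return enriched
```

In this model an event with no matching row is returned unchanged, which mirrors the note that the new field may not be added when the table has no matching entry.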
127+
### Step 5: Run the Query and View the Results
128+
Click **Run Query**. A new field (such as `stream_type_description`) will appear in the results, containing the enriched meaning of the original value.
129+
<br>
130+
![Use the Enrichment Table](../../images/use-enrichment-table.png)
131+
132+
133+
## Use Enrichment Tables in Pipelines
134+
In addition to enriching data at query time, you can apply the same enrichment logic during ingestion using **Pipelines**. This allows you to permanently transform log records as they arrive, ensuring that enriched fields are stored along with the original data.
135+
136+
### How it works
137+
138+
- You define a pipeline with a **Transform** step that uses a VRL function.
139+
- The VRL function reads from an enrichment table, just like in the **Logs** UI.
140+
- The enriched field is added before the data is written to storage.
141+
142+
!!! note
143+
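Use query-time enrichment when the mapping may still change or you want to experiment without altering stored data. Use ingestion-time enrichment when you want the enriched fields stored with every record, so that queries do not need to repeat the lookup.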
144+
145+
## Append Data to an Existing Table
146+
To add more data to an existing enrichment table, enable the **Append data to existing Enrichment Table** option before uploading your new CSV file:
147+
148+
1. In the left-hand navigation menu, select **Pipelines > Enrichment Tables**.
149+
2. Select the table you want to update.
150+
3. In the **Update Enrichment Table** view, select the new CSV file that contains the additional data.
151+
4. Turn on the **Append data to existing Enrichment Table** toggle.
152+
5. Select **Save** to upload and append the new data to the existing table.
153+
<br>
154+
![Append Data to an Existing Table](../../images/enrichment-table-append.png)
155+
156+
## Storage Limit
157+
The maximum size of an enrichment table is controlled by the environment variable `ZO_ENRICHMENT_TABLE_LIMIT`.
158+
159+
- Default value: `256` (in MB)
160+
- If the enrichment table exceeds this limit, you cannot append additional records. OpenObserve returns an error when the size threshold is reached.
161+
162+
## Upload Size Limit
163+
The maximum size of the data payload that can be uploaded at one time is controlled by the environment variable `ZO_PAYLOAD_LIMIT`.
164+
165+
- Default value: `209715200` bytes (200 MiB, approximately 210 MB)
166+
- If you attempt to upload a payload larger than the configured limit, OpenObserve returns an error.
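
The default value is easier to read once converted. The sketch below assumes that `ZO_PAYLOAD_LIMIT` is expressed in bytes, which the magnitude of the default suggests but this page does not state explicitly; the helper names are hypothetical.

```python
PAYLOAD_LIMIT_DEFAULT = 209_715_200  # default from this page, assumed to be bytes


def payload_limit_in_units(limit_bytes=PAYLOAD_LIMIT_DEFAULT):
    """Express a byte count in binary (MiB) and decimal (MB) megabytes."""
    return {
        "MiB": limit_bytes / 1024 ** 2,
        "MB": limit_bytes / 1000 ** 2,
    }


def fits_payload_limit(size_bytes, limit_bytes=PAYLOAD_LIMIT_DEFAULT):
    """Return True when a payload is not larger than the configured limit."""
    return size_bytes <= limit_bytes
```

Checking the file size with a helper like `fits_payload_limit` before uploading avoids a failed request for files that are clearly too large.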
167+
168+
## Troubleshooting
169+
- **Field not enriched:** Ensure the enrichment table column name matches the log field and that the data types are compatible.
170+
- **No result added:** Check that the enrichment table was uploaded and saved correctly, and that a matching row exists.
171+
- **Permission denied:** Ensure the user has the correct permissions in the IAM role to access the enrichment table.
172+
173+
## Related Links
174+
175+
- [Upload, Caching, and Restart Behavior](../enrichment-table-upload-recovery/)
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
Enrichment tables in OpenObserve allow you to add meaningful context to your log data by joining it with external reference data. These tables are uploaded as CSV files and can be used during ingestion or at query time to add or modify fields.
2+
3+
Learn more:
4+
5+
- [Enrichment Tables](../enrichment-tables/enrichment/)
6+
- [Upload, Caching, and Restart Behavior](../enrichment-tables/enrichment-table-upload-recovery/)

docs/user-guide/functions/.pages

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
11
nav:
22
- Functions Overview: index.md
33
- Functions in OpenObserve: functions-in-openobserve.md
4-
- Enrichment: enrichment.md
4+
