Description
I am using a 4 GB dataset and cannot upload it to Meilisearch. Neither the engine nor the importer provides a clear error message explaining what is going wrong.
MRE
I am using the db-benchmarks dataset. Due to GitHub’s file size limits, I cannot attach the file here, so the steps below download it locally and reproduce the problem:
- Clone the db-benchmarks repository:
git clone https://github.com/db-benchmarks/db-benchmarks.git
- Download the dataset:
cd db-benchmarks/tests/hn_small
./prepare_csv/prepare.sh | while IFS= read -r line; do echo -e "\t$line"; done
- Run Meilisearch:
docker run -it --rm -p 7700:7700 -d getmeili/meilisearch:v1.12
- Create and configure the index:
curl -s -X POST "http://localhost:7700/indexes" -H 'Content-Type: application/json' \
--data-binary "{\"uid\": \"hn_small\", \"primaryKey\": \"id\"}"
curl -s -X PATCH "http://localhost:7700/indexes/hn_small/settings" \
-H 'Content-Type: application/json' \
--data-binary '{"pagination":{"maxTotalHits":2000},"searchableAttributes":["story_text","comment_text","story_author","comment_author"],"filterableAttributes":["comment_ranking","story_author"],"sortableAttributes":["comment_ranking","author_comment_count","story_id","comment_id"],"typoTolerance":{"enabled":false}}'
- Run the importer:
./meilisearch-importer --url 'http://localhost:7700' --index hn_small --files ./data/data.csv --batch-size 90MB
- Confirm that the data wasn’t uploaded:
curl -s http://localhost:7700/indexes/hn_small/stats
{"numberOfDocuments":0,"isIndexing":false,"fieldDistribution":{}}
Logs
From the logs, I can see errors, but they don’t provide any meaningful explanation of what went wrong:
2025-01-28T08:48:08.923880Z INFO HTTP request{method=POST host="localhost:7700" route=/indexes/hn_small/documents query_parameters= user_agent=ureq/2.9.6 status_code=202}: meilisearch: close time.busy=17.4ms time.idle=1.50s
2025-01-28T08:48:08.927445343Z 2025-01-28T08:48:08.927360Z INFO index_scheduler: A batch of tasks was successfully completed with 0 successful tasks and 1 failed tasks.
2025-01-28T08:48:12.498362053Z 2025-01-28T08:48:12.498246Z INFO HTTP request{method=POST host="localhost:7700" route=/indexes/hn_small/documents query_parameters= user_agent=ureq/2.9.6 status_code=202}: meilisearch: close time.busy=19.7ms time.idle=1.49s
2025-01-28T08:48:12.501504720Z 2025-01-28T08:48:12.501430Z INFO index_scheduler: A batch of tasks was successfully completed with 0 successful tasks and 1 failed tasks.
2025-01-28T08:48:15.633547596Z 2025-01-28T08:48:15.633356Z INFO HTTP request{method=POST host="localhost:7700" route=/indexes/hn_small/documents query_parameters= user_agent=ureq/2.9.6 status_code=202}: meilisearch: close time.busy=19.8ms time.idle=1.30s
2025-01-28T08:48:15.637011471Z 2025-01-28T08:48:15.636957Z INFO index_scheduler: A batch of tasks was successfully completed with 0 successful tasks and 1 failed tasks.
2025-01-28T08:48:31.814040173Z 2025-01-28T08:48:31.813684Z INFO HTTP request{method=GET host="localhost:7700" route=/indexes/hn_small/stats query_parameters= user_agent=curl/8.7.1 status_code=200}: meilisearch: close time.busy=900µs time.idle=201µs
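Since the server log only reports "1 failed tasks", the failure reason may be easier to find on the task objects themselves, which Meilisearch exposes through the `/tasks` route (filterable by `indexUids` and `statuses`). Below is a minimal sketch, assuming the unauthenticated local instance from the steps above. The helper names `failed_tasks_url` and `show_failed_tasks` are hypothetical and only for illustration, and no output is shown because I have not captured it here.

```shell
# Hypothetical helper: build the /tasks query URL for failed tasks of one index.
# $1 = base URL, $2 = index uid, $3 = max number of tasks to return (defaults to 5)
failed_tasks_url() {
  printf '%s/tasks?indexUids=%s&statuses=failed&limit=%s\n' "$1" "$2" "${3:-5}"
}

# Hypothetical helper: fetch the failed tasks. Each returned task should carry
# an "error" object describing why it failed.
# Assumes no master key is set; otherwise add an Authorization: Bearer header.
show_failed_tasks() {
  curl -s "$(failed_tasks_url "$1" "$2" "$3")"
}

# Usage (requires the running container from the steps above):
#   show_failed_tasks http://localhost:7700 hn_small 1
```

The `error` field of a failed task is the closest thing to an explanation that the engine records, so it is worth attaching to a report like this one.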