Is there an existing issue for this?
- I have searched the existing issues
Environment
- Milvus version: master & 2.6
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
`LoadCellBatchAsync` in `memory_planner.cpp` splits cells into batches purely by cell count:

```
cells_per_batch = ceil(total_cells / parallel_degree)
```

This ignores the actual per-cell Arrow memory footprint. When loading a large segment (e.g. 512MB+ on disk) containing variable-size fields such as ARRAY or VARCHAR with large elements, cells can have highly uneven in-memory sizes. The count-based heuristic assumes each cell is roughly `FILE_SLICE_SIZE` (16MB), but a single cell in an ARRAY column can decompress to far more.
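The count-based heuristic can be sketched as follows. This is an illustrative reconstruction, not the actual Milvus code; the function names `ParallelDegree` and `CellsPerBatch` are hypothetical:

```cpp
#include <algorithm>
#include <cstddef>

// 16MB file slice size, as described above.
constexpr std::size_t kFileSliceSize = 16ULL << 20;

// parallel_degree is derived from the memory budget, implicitly assuming
// every cell occupies roughly one file slice (16MB) in memory.
std::size_t ParallelDegree(std::size_t memory_limit) {
    return std::max<std::size_t>(1, memory_limit / kFileSliceSize);
}

// Count-based split: distribute total_cells evenly across parallel_degree
// batches, ignoring each cell's real Arrow memory footprint.
std::size_t CellsPerBatch(std::size_t total_cells, std::size_t parallel_degree) {
    return (total_cells + parallel_degree - 1) / parallel_degree;  // ceil
}
```

With `memory_limit = 128MB` this yields `parallel_degree = 8`, matching the example below.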
Example with a large segment:
A 2GB segment with ARRAY fields may contain ~500K cells. With `memory_limit=128MB` and `FILE_SLICE_SIZE=16MB`, `parallel_degree = 128MB / 16MB = 8`, so:

```
cells_per_batch = ceil(500000 / 8) ≈ 62.5K
```

- If each cell's Arrow memory is ~4MB, each batch reads ~250GB
- 8 concurrent batches → ~2TB peak memory → OOM kill
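The peak-memory arithmetic in the example above can be checked with a one-liner (illustrative only; `PeakBytes` is not a real Milvus function):

```cpp
#include <cstdint>

// Count-based batching reads cells_per_batch * avg_cell_bytes per batch,
// multiplied by the number of batches running concurrently.
int64_t PeakBytes(int64_t cells_per_batch, int64_t avg_cell_bytes,
                  int64_t concurrency) {
    return cells_per_batch * avg_cell_bytes * concurrency;
}
```

For 62.5K cells per batch at ~4MB per cell and 8 concurrent batches, this comes to roughly 2TB, far beyond any realistic QueryNode memory budget.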
Even a moderately large segment (512MB on disk) with ARRAY fields averaging 2MB/cell can produce batches exceeding available memory by 10x+, because the count-based split distributes cells evenly without regard to their actual memory cost.
Root Cause
`CellSpec` does not carry memory size information. The batch formation loop in `LoadCellBatchAsync` only checks `current.cells.size() >= cells_per_batch`, which is a count-based heuristic derived from `memory_limit / FILE_SLICE_SIZE`. This heuristic breaks for large segments with variable-size fields (ARRAY, large VARCHAR, JSON) where individual cells can be orders of magnitude larger than `FILE_SLICE_SIZE`.
Expected Behavior
Batch splitting should respect the `memory_limit` parameter by considering each cell's estimated Arrow memory size. Each batch should be capped at `memory_limit` bytes of estimated Arrow memory, producing more batches with fewer cells when cells are large. When no size estimate is available, the planner should fall back to count-based splitting.
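One possible shape for such memory-aware splitting, sketched under the assumption that `CellSpec` gains an estimated-size field (the real struct does not have one today; all names here are hypothetical):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the real CellSpec, extended with an estimated Arrow
// memory size; est_memory < 0 means "unknown".
struct CellSpec {
    int64_t id;
    int64_t est_memory;
};

// Cap each batch at memory_limit estimated bytes; fall back to the
// count-based split for cells with no size estimate.
std::vector<std::vector<CellSpec>> SplitBatches(
        const std::vector<CellSpec>& cells,
        int64_t memory_limit,
        std::size_t cells_per_batch_fallback) {
    std::vector<std::vector<CellSpec>> batches;
    std::vector<CellSpec> current;
    int64_t current_bytes = 0;
    for (const auto& cell : cells) {
        current.push_back(cell);
        if (cell.est_memory >= 0) {
            current_bytes += cell.est_memory;
        }
        if (current_bytes >= memory_limit ||
            current.size() >= cells_per_batch_fallback) {
            batches.push_back(std::move(current));
            current.clear();
            current_bytes = 0;
        }
    }
    if (!current.empty()) {
        batches.push_back(std::move(current));
    }
    return batches;
}
```

With 100MB cells and a 128MB limit, this produces two-cell batches instead of letting 62.5K cells pile into one reader, keeping per-batch memory near the configured budget.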
Steps To Reproduce
1. Create a collection with an ARRAY field (e.g. `Array<Float, max_capacity=4096>`)
2. Insert enough data to produce a large segment (512MB+ on disk, hundreds of thousands of rows)
3. Load the collection on a QueryNode with limited memory (e.g. 32GB)
4. Observe OOM or excessive memory usage: each batch reader allocates far more than `memory_limit`

Milvus Log
No response
Anything else?
No response