Using DSE Search/Solr is memory-intensive. This discovery process is intended to help you, the DSE Search/Solr administrator, develop a plan for having sufficient memory resources to meet the needs of your users.
First, estimate how large your Solr index will grow: index a number of documents on a single node, execute typical user queries, and then examine the field cache memory usage to see how much heap is needed. Repeat this process with progressively larger numbers of documents until you can estimate the index size for the maximum number of documents that a single node can handle. You can then determine how many servers to deploy for a cluster and the optimal heap size. Store the index on SSDs, or make sure that it fits in the system IO cache.
You need to have the following hardware and data:
A node with:
1. Create a schema.xml and solrconfig.xml.
2. Start a node.
3. Add N docs.
4. Run a range of queries that simulate those of users in a production environment.
5. View the status of field cache memory to discover the memory usage.
6. View the size of the index (on disk) included in the status information about the Solr core.
7. Based on the system IO cache available on the server, set a maximum index size per server.
8. Based on the memory usage, set a maximum heap size required per server.
9. Calculate the maximum number of documents per node based on steps 6 and 7.
10. When the system is approaching the maximum number of docs per node, add more nodes.
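The arithmetic behind the last few steps can be sketched as follows. This is only an illustration, not part of DSE: the function names and all numbers are hypothetical, and it assumes that index size grows roughly linearly with the number of documents, which you should confirm by repeating the measurement with larger document counts. It also ignores replication, so each document is counted once.

```python
def max_docs_per_node(sample_docs, sample_index_bytes, max_index_bytes):
    """Scale the sample document count up to the per-server index limit.

    Assumes index size grows linearly with document count (an assumption,
    not a guarantee). Integer arithmetic avoids rounding surprises.
    """
    return sample_docs * max_index_bytes // sample_index_bytes


def nodes_needed(total_docs, docs_per_node):
    """Round up, because a partially filled node is still a node."""
    return (total_docs + docs_per_node - 1) // docs_per_node


# Hypothetical measurements from a single test node.
sample_docs = 1_000_000              # N docs added to the test node
sample_index_bytes = 2 * 1024**3     # index size on disk reported for the Solr core
max_index_bytes = 16 * 1024**3       # limit chosen from the available system IO cache
total_docs = 50_000_000              # documents the whole system should support

per_node = max_docs_per_node(sample_docs, sample_index_bytes, max_index_bytes)
print("max docs per node:", per_node)
print("nodes needed:", nodes_needed(total_docs, per_node))
```

The same proportional reasoning can be applied to the field cache memory usage to choose a heap size per server, again assuming the measured usage scales with the number of documents and with the query mix you tested.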