DataStax OpsCenter Documentation

Provisioning New Nodes and Clusters

Using these methods, you can create new clusters or add nodes to an existing cluster in an automated fashion. The nodes may be created on existing machines and VMs, or OpsCenter can launch EC2 instances to host the nodes.

Provisioning Methods                                URL
Provision a new cluster                             POST /provision
Add nodes to an existing cluster                    POST /{cluster_id}/provision
Launch EC2 instances and provision a new cluster    POST /launch
Launch EC2 instances to add nodes to a cluster      POST /{cluster_id}/launch
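
The four methods differ only in the action name and in whether a cluster ID prefixes the path. As a minimal sketch, a client could derive the path as follows; the function name provisioning_path and its parameters are hypothetical and not part of OpsCenter:

```python
from typing import Optional


def provisioning_path(launch_ec2: bool, cluster_id: Optional[str] = None) -> str:
    """Return the URL path for one of the four provisioning methods.

    launch_ec2=False targets existing machines (/provision);
    launch_ec2=True launches EC2 instances first (/launch).
    Passing cluster_id adds nodes to that existing cluster.
    """
    action = "launch" if launch_ec2 else "provision"
    if cluster_id is None:
        return f"/{action}"
    return f"/{cluster_id}/{action}"
```

For example, provisioning_path(True, "Test_Cluster") gives "/Test_Cluster/launch", where "Test_Cluster" is a made-up cluster ID.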

Provisioning on Prepared Servers

POST /provision

Set up and start a new Cassandra or DSE cluster on existing machines.

The post body should be a JSON dictionary containing, at a minimum, the following required entries:

  • nodes: A list of IP addresses to set up as part of the cluster
  • cassandra_config: A Cassandra Config, which is a JSON dictionary representing the cassandra.yaml configuration for the nodes in the cluster.
  • install_params: A dictionary specifying what packages to install and the user credentials needed to access the machines using SSH. The following entries are accepted:
    • package: Either "dsc" or "dse"
    • version: The version of DSC or DSE to install
    • username: The username to use when SSHing to the machines
    • password: The password for username. This may be omitted if a private key is specified and the user has root privileges or can perform the necessary actions using sudo without specifying a password.
    • private_key: The text contents of the private key for username. This may be omitted if a password is specified.
    • private_key_file: The path to a private key file. This may be used in place of the private_key option.
    • repo-user: If installing DSE, this is the username that you use to access the DataStax enterprise repositories.
    • repo-password: If installing DSE, this is the password that you use to access the DataStax enterprise repositories.
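
Before posting, it can help to check install_params locally against the rules above. The following is only a sketch for POST /provision: check_install_params is a hypothetical helper, and treating both repository credentials as mandatory for DSE is an assumption drawn from the descriptions above.

```python
def check_install_params(params: dict) -> list:
    """Return a list of problems found in an install_params dictionary."""
    problems = []
    if params.get("package") not in ("dsc", "dse"):
        problems.append('package must be "dsc" or "dse"')
    for key in ("version", "username"):
        if not params.get(key):
            problems.append(f"{key} is missing")
    # At least one way of authenticating over SSH has to be present.
    if not any(params.get(k) for k in ("password", "private_key", "private_key_file")):
        problems.append("need password, private_key, or private_key_file")
    # Assumption: a DSE install needs both repository credentials.
    if params.get("package") == "dse":
        for key in ("repo-user", "repo-password"):
            if not params.get(key):
                problems.append(f"{key} is missing for a DSE install")
    return problems
```

An empty list means that none of these local checks failed; OpsCenter may still reject the request for other reasons.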

The following entries are typically optional, but are sometimes required:

  • opscenter_config: A JSON dictionary representing the OpsCenter Cluster Config for the new cluster. If this option is omitted, the [cassandra]: seed_hosts option will be automatically populated and all other options will be left at their default values.

  • node_type_map: The role for each node that you specify when provisioning a DSE cluster. The entry should be a map from IP addresses to one of the following: "cassandra", "hadoop", or "solr". If omitted, "cassandra" is assumed.

  • topology_map: Specifications of the DCs and racks for the provisioned nodes, which you can pass as a map in the form {ip: [dc, rack]}. This argument is required when using a snitch other than SimpleSnitch, RackInferringSnitch, DseSimpleSnitch, or DseDelegateSnitch. The topology info is used for token selection purposes, but will not result in cassandra-topology.properties being filled out automatically.

  • token_map: A map of the form {ip: token} that overrides the default OpsCenter selection of balanced tokens for each node on a per-DC basis. OpsCenter does not currently pick balanced tokens automatically when multiple racks are in use. This should only be supplied when provisioning a cluster that is not using vnodes.

  • ssh_ip_map: A map of private interfaces to public interfaces used to deal with nodes that have different public and private interfaces (such as Amazon EC2).

  • accepted_fingerprints: SSH fingerprints are checked automatically for each host. If the fingerprint for a host has not been seen before, the operation fails with a status code of 409 unless that fingerprint is included in the accepted_fingerprints parameter. The format should be a map like {ip: <fingerprint>}, where <fingerprint> is the result of:

    ssh-keygen -lf /dev/stdin <<< "_ $(ssh-keyscan $HOST 2>/dev/null | cut -f2,3 -d ' ')"
    

    Alternatively, when a 409 status is returned, the result body will be a JSON dictionary with a fingerprints entry. The value of that entry is a map of {ip: <fingerprint>} that can be directly used as the value for the accepted_fingerprints parameter.
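
The 409 flow described for accepted_fingerprints can be sketched as a small helper that copies the fingerprints entry from the 409 result body into a new request body. The name with_accepted_fingerprints is hypothetical, and the sketch assumes the 409 body has already been parsed from JSON into a dictionary.

```python
import copy


def with_accepted_fingerprints(body: dict, conflict_body: dict) -> dict:
    """Return a copy of the post body that accepts the fingerprints
    reported in the result body of a 409 response."""
    updated = copy.deepcopy(body)
    # Keep any fingerprints that were already accepted, then add the new ones.
    accepted = dict(updated.get("accepted_fingerprints", {}))
    accepted.update(conflict_body["fingerprints"])
    updated["accepted_fingerprints"] = accepted
    return updated
```

The returned body can then be posted again. Accepting fingerprints without looking at them defeats the purpose of the check, so it is worth comparing them with values obtained from a trusted source first.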

Returns a Request ID.

Example

When launching a three-node DSC cluster, our post body might look like:

{
    "cassandra_config": {
        "authenticator": "org.apache.cassandra.auth.AllowAllAuthenticator",
        "authority": "org.apache.cassandra.auth.AllowAllAuthority",
        "auto_snapshot": true,
        "cluster_name": "Test Cluster",
        "column_index_size_in_kb": 64,
        "commitlog_directory": "/var/lib/cassandra/commitlog",
        "commitlog_sync": "periodic",
        "commitlog_sync_period_in_ms": 10000,
        "compaction_preheat_key_cache": true,
        "compaction_throughput_mb_per_sec": 16,
        "concurrent_reads": 32,
        "concurrent_writes": 32,
        "data_file_directories": [
            "/var/lib/cassandra/data"
        ],
        "dynamic_snitch_badness_threshold": 0.1,
        "dynamic_snitch_reset_interval_in_ms": 600000,
        "dynamic_snitch_update_interval_in_ms": 100,
        "encryption_options": {
            "internode_encryption": "none",
            "keystore": "conf/.keystore",
            "keystore_password": "cassandra",
            "truststore": "conf/.truststore",
            "truststore_password": "cassandra"
        },
        "endpoint_snitch": "SimpleSnitch",
        "flush_largest_memtables_at": 0.75,
        "hinted_handoff_enabled": true,
        "hinted_handoff_throttle_delay_in_ms": 1,
        "in_memory_compaction_limit_in_mb": 64,
        "incremental_backups": false,
        "index_interval": 128,
        "initial_token": null,
        "key_cache_save_period": 14400,
        "key_cache_size_in_mb": null,
        "max_hint_window_in_ms": 3600000,
        "memtable_flush_queue_size": 4,
        "multithreaded_compaction": false,
        "partitioner": "org.apache.cassandra.dht.RandomPartitioner",
        "reduce_cache_capacity_to": 0.6,
        "reduce_cache_sizes_at": 0.85,
        "request_scheduler": "org.apache.cassandra.scheduler.NoScheduler",
        "row_cache_provider": "SerializingCacheProvider",
        "row_cache_save_period": 0,
        "row_cache_size_in_mb": 0,
        "rpc_keepalive": true,
        "rpc_port": 9160,
        "rpc_server_type": "sync",
        "rpc_timeout_in_ms": 10000,
        "saved_caches_directory": "/var/lib/cassandra/saved_caches",
        "snapshot_before_compaction": false,
        "ssl_storage_port": 7001,
        "storage_port": 7000,
        "thrift_framed_transport_size_in_mb": 15,
        "thrift_max_message_length_in_mb": 16,
        "trickle_fsync": false,
        "trickle_fsync_interval_in_kb": 10240
    },
    "install_params": {
        "username": "joe",
        "password": "somepassword",
        "package": "dsc",
        "version": "1.1.2"
    },
    "nodes": [
        "192.168.100.1",
        "192.168.100.2",
        "192.168.100.3"
    ]
}

Which we can use as follows:

curl -X POST \
  localhost:8888/provision \
  -d @provision.json

Output:

"da0794da-4a3a-11e2-b745-e0b9a54a6d93"

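The same call might be made from Python using only the standard library. This is a sketch rather than an official client: the function names are hypothetical, the base URL mirrors the curl example, and sending a JSON Content-Type header is an assumption.

```python
import json
import urllib.request


def build_provision_request(body: dict, base_url: str = "http://localhost:8888"):
    """Build (but do not send) a POST /provision request for the given body."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        base_url + "/provision",
        data=data,
        headers={"Content-Type": "application/json"},  # assumption, see above
        method="POST",
    )


def submit(request) -> str:
    """Send the request and return the Request ID from the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling submit(build_provision_request(body)) against a running OpsCenter would return the Request ID as a string.
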
DSE Example

When launching a DSE cluster with two Cassandra nodes, one Hadoop node, and one Solr node, our post body might look like:

{
    "cassandra_config": {
        "authenticator": "org.apache.cassandra.auth.AllowAllAuthenticator",
        "authority": "org.apache.cassandra.auth.AllowAllAuthority",
        "auto_snapshot": true,
        ...
        "trickle_fsync_interval_in_kb": 10240
    },
    "install_params": {
        "username": "joe",
        "password": "somepassword",
        "package": "dse",
        "version": "2.2.1",
        "repo-user": "some-dse-username",
        "repo-password": "some-dse-password"
    },
    "nodes": [
        "192.168.100.1",
        "192.168.100.2",
        "192.168.100.3",
        "192.168.100.4"
    ],
    "node_type_map": {
        "192.168.100.1": "cassandra",
        "192.168.100.2": "cassandra",
        "192.168.100.3": "hadoop",
        "192.168.100.4": "solr"
    }
}

POST /{cluster_id}/provision

Add new Cassandra or DSE nodes to a cluster.

Path arguments: cluster_id -- A Cluster Config ID.

The post body should be the same as for POST /provision, but with the following differences:

  • opscenter_config is not accepted.
  • token_map is required when provisioning a non-vnode cluster.
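
Because token_map has to be supplied by hand here, the tokens need to be chosen before posting. One common approach when doubling a cluster is to place each new node in the middle of an existing token range. The sketch below assumes RandomPartitioner, whose tokens lie between 0 and 2**127; the helper names are hypothetical, and the existing tokens must be looked up from the running cluster.

```python
RING_SIZE = 2 ** 127  # assumption: RandomPartitioner token ring


def bisecting_tokens(existing: list) -> list:
    """Return one new token in the middle of each gap between existing tokens."""
    ring = sorted(int(token) for token in existing)
    new_tokens = []
    for i, start in enumerate(ring):
        end = ring[(i + 1) % len(ring)]
        # Distance to the next token, wrapping around the ring; a single
        # existing node owns the whole ring.
        gap = (end - start) % RING_SIZE or RING_SIZE
        new_tokens.append(str((start + gap // 2) % RING_SIZE))
    return new_tokens


def make_token_map(ips: list, tokens: list) -> dict:
    """Pair each new node IP with one token, in the {ip: token} form."""
    if len(ips) != len(tokens):
        raise ValueError("need exactly one token per node")
    return {ip: str(token) for ip, token in zip(ips, tokens)}
```

This yields as many new tokens as there are existing nodes; other growth patterns need a different token plan.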

Returns a Request ID.

Example

When adding two nodes to a DSC cluster, the post body may look like:

{
    "cassandra_config": {
        "authenticator": "org.apache.cassandra.auth.AllowAllAuthenticator",
        "authority": "org.apache.cassandra.auth.AllowAllAuthority",
        "auto_snapshot": true,
        ...
        "trickle_fsync_interval_in_kb": 10240
    },
    "install_params": {
        "username": "joe",
        "password": "somepassword",
        "package": "dsc",
        "version": "1.1.2"
    },
    "nodes": [
        "192.168.100.4",
        "192.168.100.5"
    ],
    "token_map": {
        "192.168.100.4": "10",
        "192.168.100.5": "46248042897083394072353090607538141429"
    }
}

Provisioning with New EC2 Instances

POST /launch

Launch a set of new EC2 instances and provision a new cluster on them.

The post body should be a JSON dictionary with two entries: launch and provision.

"launch" dictionary

The launch entry should be a dictionary containing the following required entries:

  • ec2_access_id: The AWS Access ID for the account you wish to launch the instances with.

  • ec2_secret_key: The matching AWS Secret Key for the provided Access ID.

  • location: The AWS region to use for launching the instances. The following options are available:

    • "US East (Northern Virginia)"
    • "US West (Northern California)"
    • "US West (Oregon)"
    • "EU (Ireland)"
    • "Asia Pacific (Tokyo)"
    • "Asia Pacific (Singapore)"
    • "South America (Sao Paulo)"

    The default location is "US East (Northern Virginia)".

The following optional entries may also be used inside the launch dictionary:

  • zone: The availability zone to launch the instances in. For example, options for the US East region include "us-east-1a", "us-east-1b", "us-east-1c", "us-east-1d", and "us-east-1e". If omitted, an availability zone will be randomly chosen.
  • image_id: The AMI ID to use for the new instances. A good default choice for this is the DataStax AMI for the region. To see all options organized by region, look at <opscenterd_conf_dir>/definitions/ec2-instances-*.json. By default, the DataStax AMI for US East will be used.
  • image_size: The type and size of instances to launch. Options include "m1.small", "m1.large", "c1.medium", and so on. Some sizes are only available in select AWS regions. By default, "m1.large" will be used.
  • keypair: The name of an AWS keypair to use when launching the new EC2 instances. If omitted, a new key pair will be created with the name OpsCenterProvisioningKeyPair.
  • security_group: The name of an AWS security group to use when launching the new EC2 instances. If omitted, a new security group named OpsCenterSecurityGroup will be created and used. This security group opens the default ports used by Cassandra, DSE, OpsCenter, and SSH.
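
The launch dictionary can be assembled with a small helper that rejects unknown locations and option names before anything is sent. This is a sketch only; make_launch_config is a hypothetical name, and the location list simply repeats the options given above.

```python
LOCATIONS = (
    "US East (Northern Virginia)",
    "US West (Northern California)",
    "US West (Oregon)",
    "EU (Ireland)",
    "Asia Pacific (Tokyo)",
    "Asia Pacific (Singapore)",
    "South America (Sao Paulo)",
)
OPTIONAL_KEYS = ("zone", "image_id", "image_size", "keypair", "security_group")


def make_launch_config(access_id, secret_key, location=LOCATIONS[0], **optional):
    """Build a launch dictionary, checking the location and option names."""
    if location not in LOCATIONS:
        raise ValueError(f"unknown location: {location!r}")
    unknown = sorted(set(optional) - set(OPTIONAL_KEYS))
    if unknown:
        raise ValueError(f"unknown launch options: {unknown}")
    config = {
        "ec2_access_id": access_id,
        "ec2_secret_key": secret_key,
        "location": location,
    }
    # Options that are left out fall back to the defaults described above.
    config.update(optional)
    return config
```

For example, make_launch_config(access_id, secret_key, zone="us-east-1a") keeps the default location and leaves the image, key pair, and security group to OpsCenter.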

"provision" dictionary

  • cassandra_config: A Cassandra Config, which is a JSON dictionary representing the cassandra.yaml configuration for the nodes in the cluster.
  • install_params: A dictionary specifying what packages to install and the user credentials needed to access the machines using SSH. The following entries are accepted:
    • package: Either "dsc" or "dse"
    • version: The version of DSC or DSE to install
    • username: The username to use when SSHing to the machines
    • private_key: The text contents of the private key for username. This may be omitted if a password is specified or if the keypair option was omitted, which results in a new key pair being created.
    • private_key_file: The path to a private key file. This may be used in place of the private_key option.
    • repo-user: If installing DSE, this is the username that you use to access the DataStax enterprise repositories.
    • repo-password: If installing DSE, this is the password that you use to access the DataStax enterprise repositories.

The following entries are typically optional, but are sometimes required:

  • opscenter_config: A JSON dictionary representing the OpsCenter Cluster Config for the new cluster. If this option is omitted, the [cassandra]: seed_hosts option will be automatically populated and all other options will be left at their default values.
  • node_type_counts: When provisioning a DSE cluster, this allows you to specify how many nodes of each role type should be provisioned. It should be a map of the form {node_type: count}, where node_type is one of "cassandra", "hadoop", or "solr". For example, if {"cassandra": 4, "hadoop": 2} were used, the cluster would contain six nodes, with four running just DSE and two running Hadoop services as well. When launching DSC clusters, only the cassandra type should be used.
  • type_token_map: Normally, OpsCenter will select balanced tokens for each node on a per-DC basis. You can optionally override those selections by passing in a map of the form {node_type: [token, token, ...]} where node_type is one of "cassandra", "hadoop", or "solr". This parameter should only be used when provisioning non-vnode clusters.
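
To see how many instances a request asks for, the two maps can be summarized as below. The helper names are hypothetical, and reading the number of nodes per type from the length of each type_token_map list is an inference from the map's form, not documented behavior.

```python
NODE_TYPES = ("cassandra", "hadoop", "solr")


def total_nodes(node_type_counts: dict) -> int:
    """Return the cluster size implied by a node_type_counts map."""
    for node_type in node_type_counts:
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type!r}")
    return sum(node_type_counts.values())


def counts_from_type_token_map(type_token_map: dict) -> dict:
    """Return {node_type: count}, assuming one node per listed token."""
    return {node_type: len(tokens) for node_type, tokens in type_token_map.items()}
```

With the example above, total_nodes({"cassandra": 4, "hadoop": 2}) gives 6.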

Returns a Request ID.

Example

To launch a three-node cluster in US East, availability zone 'a', our post body might look like this:

{
    "provision": {
        "cassandra_config": {
            "authenticator": "org.apache.cassandra.auth.AllowAllAuthenticator",
            "authority": "org.apache.cassandra.auth.AllowAllAuthority",
            "auto_snapshot": true,
            ...
            "trickle_fsync_interval_in_kb": 10240
        },
        "install_params": {
            "package": "dsc",
            "version": "1.1.2",
            "username": "ubuntu"
        },
        "node_type_counts": {
            "cassandra": 3
        }
    },

    "launch": {
        "ec2_access_id": "9BATIQ711LYY326X6191",
        "ec2_secret_key": "I2nr9Jk1welj1IJekIejkbnQ91JmajIJaRea2ajR",
        "location": "US East (Northern Virginia)",
        "zone": "us-east-1a",
        "image_id": "ami-6139e708",
        "image_size": "m2.xlarge",
        "keypair": "MyKeyPair",
        "security_group": "default"
    }
}

POST /{cluster_id}/launch

Launch a set of new EC2 instances in order to add new nodes to a Cassandra or DSE cluster.

Path arguments: cluster_id -- A Cluster Config ID.

The post body should be the same as the one used for POST /launch with the following differences in the provision dictionary:

  • opscenter_config should be omitted.
  • node_type_counts is required only when provisioning a vnode cluster.
  • type_token_map is required when provisioning a non-vnode cluster (and replaces node_type_counts).
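
These differences can be applied mechanically to a body that was written for POST /launch. The sketch below uses a hypothetical helper, adapt_for_existing_cluster; dropping type_token_map for vnode clusters is an assumption based on the note that it should only be used for non-vnode clusters.

```python
import copy


def adapt_for_existing_cluster(launch_body: dict, uses_vnodes: bool) -> dict:
    """Return a copy of a POST /launch body suitable for POST /{cluster_id}/launch."""
    body = copy.deepcopy(launch_body)
    provision = body["provision"]
    # opscenter_config should be omitted when adding to an existing cluster.
    provision.pop("opscenter_config", None)
    if uses_vnodes:
        if "node_type_counts" not in provision:
            raise ValueError("node_type_counts is required for a vnode cluster")
        # Assumption: explicit tokens are not used with vnodes.
        provision.pop("type_token_map", None)
    else:
        if "type_token_map" not in provision:
            raise ValueError("type_token_map is required for a non-vnode cluster")
        # type_token_map replaces node_type_counts.
        provision.pop("node_type_counts", None)
    return body
```

The original body is left untouched, so the same template can be reused for both endpoints.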

Returns a Request ID.

Example

To launch three EC2 instances and add two Hadoop nodes and one Solr node to an existing DSE cluster, the post body might look like this:

{
    "provision": {
        "cassandra_config": {
            "authenticator": "org.apache.cassandra.auth.AllowAllAuthenticator",
            "authority": "org.apache.cassandra.auth.AllowAllAuthority",
            "auto_snapshot": true,
            ...
            "trickle_fsync_interval_in_kb": 10240
        },
        "install_params": {
            "package": "dse",
            "version": "2.2.1",
            "username": "ubuntu",
            "repo-user": "some-dse-username",
            "repo-password": "some-dse-password"
        },
        "type_token_map": {
            "hadoop": ["10", "86248042897083394072353090607538141429"],
            "solr": ["46248042897083394072353090607538141429"]
        }
    },

    "launch": {
        "ec2_access_id": "9BATIQ711LYY326X6191",
        "ec2_secret_key": "I2nr9Jk1welj1IJekIejkbnQ91JmajIJaRea2ajR",
        "location": "US East (Northern Virginia)",
        "zone": "us-east-1a",
        "image_id": "ami-6139e708",
        "image_size": "m2.xlarge",
        "keypair": "MyKeyPair",
        "security_group": "default"
    }
}