A short troubleshooting guide to OpsCenter
Hi, my name is Erin and I'm a QA engineer. I've been on the OpsCenter team for about six months, and these are some problems I've encountered along with their possible solutions.
The following are things to check if you are not able to load OpsCenter in your browser, are not able to connect to your cluster, or notice other problems in OpsCenter.
- Make sure the opscenterd process is up and running. To check, run the following on the command line: ps -ef | grep opscenter
- Which browser are you using? Our supported browsers are Chrome, Firefox, and Safari.
- Are OpsCenter and the agents running? You might need to restart the agents after upgrading your cluster or OpsCenter. Here are directions for restarting the agents.
- Is your firewall up? I'm guilty of this one: I sometimes forget to run "sudo service iptables stop" on my Red Hat cluster. Remember to restart after disabling the firewall. Also check which ports are open. This gave me problems when I was setting up my Amazon EC2 account, where open ports are controlled by your security groups. The solution was to make the ports as open as possible. Important ports for OpsCenter can be found here.
- Is SSL mismatched? For example, your agents have SSL enabled and OpsCenter does not, or vice versa. See the OpsCenter SSL documentation.
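The process check from the first bullet can be wrapped in a small script that also looks for the agent. This is only a sketch under stated assumptions: it assumes the process command lines contain the strings opscenterd and opscenter-agent (the latter taken from the service name used later in this post), and listing_has is a hypothetical helper name, not part of OpsCenter. It filters out the grep command itself, which otherwise appears in the ps output and can make a stopped process look like it is running.

```shell
#!/bin/sh
# Sketch only: listing_has is a hypothetical helper, not part of OpsCenter.

# Succeeds if the process listing on stdin mentions the given name,
# ignoring lines that belong to the grep command itself.
listing_has() {
    grep -- "$1" | grep -v grep > /dev/null
}

# Assumption: the command lines contain these strings on your install.
for name in opscenterd opscenter-agent; do
    if ps -ef 2>/dev/null | listing_has "$name"; then
        echo "$name appears to be running"
    else
        echo "$name does not appear to be running"
    fi
done
```

Run it on the OpsCenter machine and on each node. A "does not appear to be running" line tells you which service to start before you look at firewalls or SSL.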
What functionality is limited to the enterprise edition?
- Provisioning DataStax Enterprise clusters
- Managing multiple clusters
- Data backups
- Cluster report
- Diagnostics tarball
- Email notifications
Permissions and Backups
This applies if you use OpsCenter and your DSE/Cassandra cluster was installed from a tarball. If you have this setup and try to run a backup, you will likely see the backup fail with an error.
You will need to make sure the user running the agent has read/write permissions on the Cassandra data directory. Choose one of the following solutions:
- Add the user to the group that owns the directory.
- Run the opscenter-agent as root. Once opscenterd, Cassandra, and the agent are installed and running (but before trying to back up or restore anything), edit /etc/init.d/opscenter-agent on your cluster, replacing the USER= line with USER="root". Then run sudo service opscenter-agent restart.
You should be able to run a backup and restore successfully now.
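Both options above can be sketched in shell. Everything named here is an assumption for illustration: the data directory path and the helper names can_read_write and set_agent_user are hypothetical, and the sed pattern assumes the USER= line starts at the beginning of a line in the init script. Treat it as a starting point under those assumptions rather than a tested procedure for your install.

```shell
#!/bin/sh
# Sketch only. Placeholder value: adjust for your install.
DATA_DIR="/var/lib/cassandra/data"    # assumed Cassandra data directory

# Succeeds if the current user can read and write the given directory.
can_read_write() {
    [ -d "$1" ] && [ -r "$1" ] && [ -w "$1" ]
}

# Rewrites the USER= line of the given init script so the agent runs as
# the given user. Assumes the line starts with USER= at column one.
set_agent_user() {
    script="$1"
    user="$2"
    tmp=$(mktemp) || return 1
    sed "s/^USER=.*/USER=\"$user\"/" "$script" > "$tmp" && cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Run this script as the user that runs the agent to check its access.
if can_read_write "$DATA_DIR"; then
    echo "current user can read and write $DATA_DIR"
else
    echo "current user cannot read and write $DATA_DIR"
fi

# Option 1: add the agent user to the group that owns the directory
# (stat -c is GNU coreutils syntax; AGENT_USER is a placeholder):
#   sudo usermod -a -G "$(stat -c '%G' "$DATA_DIR")" AGENT_USER
# Option 2: as root, switch the init script to root and restart the agent:
#   set_agent_user /etc/init.d/opscenter-agent root
#   service opscenter-agent restart
```

Option 1 only helps if the owning group itself has read and write access to the directory, so check the directory's mode as well. The two privileged steps are left as comments so that running the script changes nothing on its own.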
Resources and Documentation
Good online resources and documentation sure come in handy, especially when you have forgotten something. Here's a list of some of the most useful resources I know of.
When in doubt, do not be afraid to ask questions.