<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="bbPress/1.0.3" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title>DataStax Support Forums &#187; User Favorites: cooptron</title>
		<link>http://www.datastax.com/support-forums/profile/cooptron</link>
		<description>Software, Support, and Training for Apache Cassandra</description>
		<language>en-US</language>
		<pubDate>Fri, 24 May 2013 19:18:14 +0000</pubDate>
		<generator>http://bbpress.org/?v=1.0.3</generator>
		<textInput>
			<title><![CDATA[Search]]></title>
			<description><![CDATA[Search all topics from these forums.]]></description>
			<name>q</name>
			<link>http://www.datastax.com/support-forums/search.php</link>
		</textInput>
		<atom:link href="http://www.datastax.com/support-forums/rss/profile/" rel="self" type="application/rss+xml" />

		<item>
			<title>cooptron on "OpsCenter agent stuck on maxed thrift queue"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-stuck-on-maxed-thrift-queue#post-8000</link>
			<pubDate>Mon, 17 Dec 2012 21:26:13 +0000</pubDate>
			<dc:creator>cooptron</dc:creator>
			<guid isPermaLink="false">8000@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;We do have secondary indexes.  I assume that would affect the repair details as well, since a repair depends on a compaction?  We look forward to the new version!  &#60;/p&#62;
&#60;p&#62;Thanks,&#60;br /&#62;
Andrew
&#60;/p&#62;</description>
		</item>
		<item>
			<title>nickmbailey on "OpsCenter agent stuck on maxed thrift queue"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-stuck-on-maxed-thrift-queue#post-7999</link>
			<pubDate>Mon, 17 Dec 2012 19:44:37 +0000</pubDate>
			<dc:creator>nickmbailey</dc:creator>
			<guid isPermaLink="false">7999@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;Andrew,&#60;/p&#62;
&#60;p&#62;Do any of your CFs have secondary indexes? There is a known bug where secondary index compaction tasks can cause the compaction/streaming details to stall indefinitely. That will be fixed in the upcoming 2.1.3 release.&#60;/p&#62;
&#60;p&#62;-Nick
&#60;/p&#62;</description>
		</item>
		<item>
			<title>cooptron on "OpsCenter agent stuck on maxed thrift queue"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-stuck-on-maxed-thrift-queue#post-7998</link>
			<pubDate>Mon, 17 Dec 2012 19:33:03 +0000</pubDate>
			<dc:creator>cooptron</dc:creator>
			<guid isPermaLink="false">7998@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;The symptom is that the repair/compaction/stream information in the cluster views gets &#34;stuck&#34;.  The percentages no longer move, and existing repairs do not go away in OpsCenter, even though the Cassandra node is no longer repairing or compacting.  No new information shows up in OpsCenter for nodes that start repairs.  Basically, that part of the agent appears to stall indefinitely, while the OS stats and basic ring information still work.&#60;/p&#62;
&#60;p&#62;The only ERROR lines I see in the agent log are from the initial configuration; they appear on every restart and seem to come from the auto-discovery of the thrift port.  It does connect to JMX on localhost.&#60;/p&#62;
&#60;p&#62;ERROR [Initialization] 2012-12-17 12:24:01,871 MARK HOST AS DOWN TRIGGERED for host 10.1.1.43(10.1.1.43):9160&#60;br /&#62;
ERROR [Initialization] 2012-12-17 12:24:01,872 Pool state on shutdown: &#38;lt;ConcurrentCassandraClientPoolByHost&#38;gt;:{10.1.1.43(10.1.1.43):9160}; IsActive?: true; Active: 0; Blocked: 0; Idle: 0; NumBeforeExhausted: 1&#60;br /&#62;
ERROR [Initialization] 2012-12-17 12:24:01,878 Error when performing thrift operation: #&#38;lt;HectorException me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden pushed out to client.&#38;gt;&#60;br /&#62;
ERROR [Thread-5] 2012-12-17 12:24:01,879 Unable to connect to Cassandra #&#38;lt;HectorException me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden pushed out to client.&#38;gt;&#60;/p&#62;
&#60;p&#62;I turned logging up to debug level on our test cluster (we are seeing the same situation there, with far fewer CFs), and I see it collecting metrics on a regular basis, but then at random it will spam the following (multiple times a second).  Debug-level logging did not provide any additional information about the cause.  I can send full logs if you are interested.&#60;/p&#62;
&#60;p&#62; WARN [Thread-2] 2012-12-17 13:16:13,062 Thrift operation queue is full, discarding thrift operation&#60;br /&#62;
 WARN [Thread-2] 2012-12-17 13:16:13,062 271315 operations dropped so far.&#60;/p&#62;
&#60;p&#62;It doesn't appear that the issue is based on the number of metrics: the &#34;stall&#34; happens in our test cluster with roughly 100 CFs, and I cranked down the metrics in production (using ignore_keyspaces in the OpsCenter server config), but the issue still exists.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>nickmbailey on "OpsCenter agent stuck on maxed thrift queue"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-stuck-on-maxed-thrift-queue#post-7996</link>
			<pubDate>Mon, 17 Dec 2012 17:50:58 +0000</pubDate>
			<dc:creator>nickmbailey</dc:creator>
			<guid isPermaLink="false">7996@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;Andrew,&#60;/p&#62;
&#60;p&#62;You are correct that reducing the number of column families you collect metrics for will help with thrift operations being discarded.  However, node information like compactions and streams is already separate from metric collection, so those discarded operations shouldn't be affecting that data. Are you seeing compactions/streaming in nodetool that aren't showing up in OpsCenter?&#60;/p&#62;
&#60;p&#62;Are there any other errors in the agent log when you see this issue?&#60;/p&#62;
&#60;p&#62;-Nick
&#60;/p&#62;</description>
		</item>
		<item>
			<title>cooptron on "OpsCenter agent stuck on maxed thrift queue"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-stuck-on-maxed-thrift-queue#post-7995</link>
			<pubDate>Mon, 17 Dec 2012 16:58:14 +0000</pubDate>
			<dc:creator>cooptron</dc:creator>
			<guid isPermaLink="false">7995@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;We are seeing an issue with the OpsCenter agent where we stop receiving information related to Cassandra activities in the OpsCenter server view (repairs, streams, compactions).  We do still see IO information and load information.  This correlates with log messages on the agents stating that the thrift operations queue is full and that it is dropping thrift requests.  If we restart the agent, it will start sending all information again for a limited time and then start dropping thrift operations again.  We have quite a few column families (across all keyspaces, roughly 3000).  I know we could likely fix this by reducing the number of column families that we want to see metrics for, but I was curious whether there are tuning knobs to either increase the polling interval between metric collections or increase the size of the thrift queue.  Is there a way to put the operations information (repairs, compactions, streams) into a separate queue so it is not affected by the metrics gathering?&#60;/p&#62;
&#60;p&#62;Thanks,&#60;br /&#62;
Andrew
&#60;/p&#62;</description>
		</item>
		<item>
			<title>cooptron on "OpsCenter configuration for cassandra using dual networks"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-configuration-for-cassandra-using-dual-networks#post-7994</link>
			<pubDate>Mon, 17 Dec 2012 16:51:54 +0000</pubDate>
			<dc:creator>cooptron</dc:creator>
			<guid isPermaLink="false">7994@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;This did help; I was able to successfully configure the agents for bi-directional communication.  I do see another issue, for which I will open a new topic.  Thanks again!
&#60;/p&#62;</description>
		</item>
		<item>
			<title>mbulman on "OpsCenter configuration for cassandra using dual networks"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-configuration-for-cassandra-using-dual-networks#post-7888</link>
			<pubDate>Fri, 07 Dec 2012 14:54:23 +0000</pubDate>
			<dc:creator>mbulman</dc:creator>
			<guid isPermaLink="false">7888@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;Some information on the 3 interfaces/IPs involved on the agent side can be seen in step 3 here:  &#60;a href=&#34;http://www.datastax.com/docs/opscenter/configure/configure_multi_region&#34; rel=&#34;nofollow&#34;&#62;http://www.datastax.com/docs/opscenter/configure/configure_multi_region&#60;/a&#62;&#60;/p&#62;
&#60;p&#62;It sounds like, since the gossip network is unroutable, you'll want to configure agent_rpc_interface on each node to point to Cassandra's rpc interface (by default, that will pull from the broadcast address configured in cassandra.yaml).&#60;/p&#62;
&#60;p&#62;If you're not able to bind to that interface on the agent side, you can configure agent_rpc_broadcast_address instead, which will allow the agent's http server to bind on a separate interface from the one opscenterd connects on.&#60;/p&#62;
&#60;p&#62;Hope that helps.  Let us know if you have any other questions.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>cooptron on "OpsCenter configuration for cassandra using dual networks"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-configuration-for-cassandra-using-dual-networks#post-7868</link>
			<pubDate>Thu, 06 Dec 2012 23:07:42 +0000</pubDate>
			<dc:creator>cooptron</dc:creator>
			<guid isPermaLink="false">7868@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;tupton,&#60;/p&#62;
&#60;p&#62;It appears that the agents are connecting to the localhost JMX/thrift OK, but OpsCenter is trying to communicate with the agents on the gossip interface of our Cassandra cluster instead of the thrift/rpc interface.  This is causing half of the information in OpsCenter to be missing (storage, etc.) even though all the agents are showing as connected.&#60;/p&#62;
&#60;p&#62;Here is the opscenterd log output.  It SHOULD be connecting to 172.23.58.80, etc.  The gossip network (172.23.1.x) is unroutable on purpose, so the OpsCenter server cannot access it. &#60;/p&#62;
&#60;p&#62;From what it seems, you are gathering up the node IPs via thrift calls, but that call is using autoDiscoverHosts instead of requesting the thrift IP address of the nodes, so it is defaulting to the gossip interface.&#60;/p&#62;
&#60;p&#62;What I set up in OpsCenter for the seed addresses:&#60;br /&#62;
172.23.58.80&#60;br /&#62;
172.23.58.81&#60;br /&#62;
172.23.58.82&#60;/p&#62;
&#60;p&#62;Output from opscenterd trying to connect to an agent:&#60;/p&#62;
&#60;p&#62;WARN: HTTP request &#60;a href=&#34;https://172.23.1.87:61621/cluster/datacenter?node_ip=172.23.1.81&#34; rel=&#34;nofollow&#34;&#62;https://172.23.1.87:61621/cluster/datacenter?node_ip=172.23.1.81&#60;/a&#62; failed: Connection was refused by other side: 111: Connection refused.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>tupton on "OpsCenter configuration for cassandra using dual networks"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-configuration-for-cassandra-using-dual-networks#post-7014</link>
			<pubDate>Thu, 18 Oct 2012 20:18:00 +0000</pubDate>
			<dc:creator>tupton</dc:creator>
			<guid isPermaLink="false">7014@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;cooptron,&#60;/p&#62;
&#60;p&#62;The agents use localhost for JMX by default, so you shouldn't have to change anything to set up opscenter like that. If using localhost is a problem, you can set &#34;jmx_host&#34; in address.yaml on each node to the IP that you need.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>cooptron on "OpsCenter configuration for cassandra using dual networks"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-configuration-for-cassandra-using-dual-networks#post-7013</link>
			<pubDate>Thu, 18 Oct 2012 20:08:46 +0000</pubDate>
			<dc:creator>cooptron</dc:creator>
			<guid isPermaLink="false">7013@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;We have our Cassandra implementations set up so that nodes have two network interfaces, one for gossip/JMX and one for thrift/app connections.  How can I configure OpsCenter to use different IP addresses for thrift connections and JMX connections?  As far as I can tell, you can only specify a single IP address for each Cassandra node.
&#60;/p&#62;</description>
		</item>

	</channel>
</rss>
