<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="bbPress/1.0.3" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title>DataStax Support Forums &#187; User Favorites: tobz</title>
		<link>http://www.datastax.com/support-forums/profile/tobz</link>
		<description>Software, Support, and Training for Apache Cassandra</description>
		<language>en-US</language>
		<pubDate>Wed, 19 Jun 2013 05:39:57 +0000</pubDate>
		<generator>http://bbpress.org/?v=1.0.3</generator>
		<textInput>
			<title><![CDATA[Search]]></title>
			<description><![CDATA[Search all topics from these forums.]]></description>
			<name>q</name>
			<link>http://www.datastax.com/support-forums/search.php</link>
		</textInput>
		<atom:link href="http://www.datastax.com/support-forums/rss/profile/" rel="self" type="application/rss+xml" />

		<item>
			<title>nickmbailey on "OpsCenter agent locking up an entire core for hours on end"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-locking-up-an-entire-core-for-hours-on-end#post-6679</link>
			<pubDate>Wed, 26 Sep 2012 16:34:28 +0000</pubDate>
			<dc:creator>nickmbailey</dc:creator>
			<guid isPermaLink="false">6679@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;Roel,&#60;/p&#62;
&#60;p&#62;We aren't able to reproduce the issue on our end. If you change the logging level of the agent to DEBUG (configured in log4j.properties) and then upload the agent's log when this happens, it may help us see what the issue is.&#60;/p&#62;
&#60;p&#62;-Nick
&#60;/p&#62;</description>
		</item>
		<item>
			<title>roelb on "OpsCenter agent locking up an entire core for hours on end"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-locking-up-an-entire-core-for-hours-on-end#post-6503</link>
			<pubDate>Sat, 15 Sep 2012 09:19:06 +0000</pubDate>
			<dc:creator>roelb</dc:creator>
			<guid isPermaLink="false">6503@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;It feels like something time-based.&#60;/p&#62;
&#60;p&#62;On my laptop / personal test environment, I have set up a 2-node cluster (in VMware Workstation), and this bug can easily be reproduced there.&#60;/p&#62;
&#60;p&#62;You just put the laptop to sleep and wake it up again. &#60;/p&#62;
&#60;p&#62;This causes the OpsCenter agent to use all available CPU and eventually lock up the entire system.&#60;br /&#62;
Everything else on the VMs copes with the sleep event/period; only the OpsCenter agent can't handle it.&#60;/p&#62;
&#60;p&#62;Not an urgent bug, of course, since it's something you would never do in a production environment, but it is still worth looking at.&#60;/p&#62;
&#60;p&#62;Currently, the first thing I do in the morning is log on to the VMs and restart the OpsCenter agent.&#60;/p&#62;
&#60;p&#62;&#60;strong&#62;Linux:&#60;/strong&#62;&#60;br /&#62;
Ubuntu 12.04.1 LTS&#60;br /&#62;
Linux ubuntu 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux&#60;/p&#62;
&#60;p&#62;&#60;strong&#62;Cassandra:&#60;/strong&#62;&#60;br /&#62;
dsc1.1&#60;/p&#62;
&#60;p&#62;Thanks in advance,&#60;/p&#62;
&#60;p&#62;Roel
&#60;/p&#62;</description>
		</item>
		<item>
			<title>tobz on "Support fine-grained node selection for agent installation"</title>
			<link>http://www.datastax.com/support-forums/topic/support-fine-grained-node-selection-for-agent-installation#post-2639</link>
			<pubDate>Tue, 03 Jul 2012 16:09:51 +0000</pubDate>
			<dc:creator>tobz</dc:creator>
			<guid isPermaLink="false">2639@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;When trying to have OpsCenter remotely log into nodes and install the monitoring agent, there is no option to select which nodes you want the agent installed on.&#60;/p&#62;
&#60;p&#62;For instance, I wanted to deploy the monitoring agent specifically to a server that I knew had been rebooted after the leap second change on Sunday morning.  However, I have servers in my cluster that haven't been rebooted yet.  If I could have selected which servers I wanted to deploy to, I wouldn't have had to go and reboot the servers or manually install the agent.&#60;/p&#62;
&#60;p&#62;This is a very specific example but I think the ability to deploy selectively will let users benefit from the automation you guys have built while not having to worry about deploying something to their whole cluster that they haven't entirely vetted yet.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>nickmbailey on "OpsCenter agent locking up an entire core for hours on end"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-locking-up-an-entire-core-for-hours-on-end#post-2579</link>
			<pubDate>Mon, 02 Jul 2012 16:00:53 +0000</pubDate>
			<dc:creator>nickmbailey</dc:creator>
			<guid isPermaLink="false">2579@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;Glad you got it worked out quickly :).&#60;/p&#62;
&#60;p&#62;Let us know if you run into any more issues.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>tobz on "OpsCenter agent locking up an entire core for hours on end"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-locking-up-an-entire-core-for-hours-on-end#post-2578</link>
			<pubDate>Mon, 02 Jul 2012 15:55:44 +0000</pubDate>
			<dc:creator>tobz</dc:creator>
			<guid isPermaLink="false">2578@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;So it does indeed seem like this fixed the issue.  I now have the agent running on the same node that was used as a reference in my first post and it has yet to lock the box up.  Woop woop!&#60;/p&#62;
&#60;p&#62;Thanks for the quick response / solution. :)
&#60;/p&#62;</description>
		</item>
		<item>
			<title>nickmbailey on "OpsCenter agent locking up an entire core for hours on end"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-locking-up-an-entire-core-for-hours-on-end#post-2575</link>
			<pubDate>Mon, 02 Jul 2012 15:23:29 +0000</pubDate>
			<dc:creator>nickmbailey</dc:creator>
			<guid isPermaLink="false">2575@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;Since you started using it yesterday, the timing is consistent with the leap second bug being a potential cause. People have reported that the command above resolves the problem, as does fully rebooting the nodes.&#60;/p&#62;
&#60;p&#62;I also just realized that the forums formatted the command above strangely. It should be the code in this link:&#60;/p&#62;
&#60;p&#62;&#60;a href=&#34;http://pastebin.com/p64RVvdK&#34; rel=&#34;nofollow&#34;&#62;http://pastebin.com/p64RVvdK&#60;/a&#62;&#60;/p&#62;
&#60;p&#62;If it turns out that is not the problem let us know and we can debug further.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>tobz on "OpsCenter agent locking up an entire core for hours on end"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-locking-up-an-entire-core-for-hours-on-end#post-2572</link>
			<pubDate>Mon, 02 Jul 2012 14:34:52 +0000</pubDate>
			<dc:creator>tobz</dc:creator>
			<guid isPermaLink="false">2572@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;The issue has been present since I started using OpsCenter... which was mid-afternoon yesterday.  I started with OpsCenter 2.1 and just updated to 2.1.1 to try with the new version 1.9 of the agent... same thing.&#60;/p&#62;
&#60;p&#62;I'll give your suggested fix a try and see if that helps at all.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>nickmbailey on "OpsCenter agent locking up an entire core for hours on end"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-locking-up-an-entire-core-for-hours-on-end#post-2571</link>
			<pubDate>Mon, 02 Jul 2012 14:26:35 +0000</pubDate>
			<dc:creator>nickmbailey</dc:creator>
			<guid isPermaLink="false">2571@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;When did you start seeing this issue? Is there any chance this is related to the leap second bug people have been seeing?&#60;/p&#62;
&#60;p&#62;&#60;a href=&#34;https://lkml.org/lkml/2012/6/30/122&#34; rel=&#34;nofollow&#34;&#62;https://lkml.org/lkml/2012/6/30/122&#60;/a&#62;&#60;/p&#62;
&#60;p&#62;The fix for that is to run:&#60;/p&#62;
&#60;p&#62;date; date &#60;code&#62;date +&#38;quot;%m%d%H%M%C%y.%S&#38;quot;&#60;/code&#62;; date;
&#60;/p&#62;</description>
		</item>
		<item>
			<title>tobz on "OpsCenter agent locking up an entire core for hours on end"</title>
			<link>http://www.datastax.com/support-forums/topic/opscenter-agent-locking-up-an-entire-core-for-hours-on-end#post-2568</link>
			<pubDate>Mon, 02 Jul 2012 13:28:53 +0000</pubDate>
			<dc:creator>tobz</dc:creator>
			<guid isPermaLink="false">2568@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;Heyo all. :)&#60;/p&#62;
&#60;p&#62;First of all, OpsCenter is pretty sweet. Simple UI, gives me some good insight as a whole where I would otherwise be using CLI tools to try and aggregate stats in my head all at once... yuck. :P&#60;/p&#62;
&#60;p&#62;I'm unfortunately having a problem with OpsCenter, though, where the agent seems to lock up an entire core on the box for hours on end... seemingly at random.&#60;/p&#62;
&#60;p&#62;I have a four-node cluster running Cassandra 1.1 on a RightScale Ubuntu 10.04 LTS image. I had OpsCenter automatically SSH in and install the agent. At first, things were working fine, but then two nodes went dead. They were pegged at 50% CPU for a few hours, with occasional drops of CPU usage back down to normal every 45 minutes or so. This lasted, overall, for 7 - 8 hours and then mysteriously the boxes were back to normal and the nodes went UP again. The CPU usage problem has manifested itself on all nodes thus far. I tried restarting the agents, and sometimes an agent would go right back into this state and peg a core immediately after the restart.&#60;/p&#62;
&#60;p&#62;I made sure to test the problem by running for 24 hours with the agents stopped. CPU barely went above 1% on average. As soon as I re-enabled all the agents.... the boxes locked up again and came back randomly a few hours later.&#60;/p&#62;
&#60;p&#62;One thing I noticed is that when the agent starts to lock the box up... I can &#34;fix&#34; the issue by bringing down the main OpsCenter instance.  For example, one agent in particular here started to lock up (at around 12:37) and I decided to try the upgrade to 2.1.1 to see if it would remedy the issue... and I brought OpsCenter down around 13:02.  The CPU usage dropped back down to nearly nothing at around 13:04 and then the log finally got some data again at 13:10 when a connection timed out/closed unexpectedly.&#60;/p&#62;
&#60;p&#62;&#60;code&#62;&#60;br /&#62;
DEBUG [Thread-7] 2012-07-02 12:37:21,835 Connection shut down&#60;br /&#62;
ERROR [StompConnection receiver] 2012-07-02 13:10:57,019 Connection closed unexpectedly:&#60;br /&#62;
java.io.EOFException: reading verb&#60;br /&#62;
        at org.jgroups.protocols.STOMP.readFrame(STOMP.java:227)&#60;br /&#62;
        at org.jgroups.client.StompConnection.run(StompConnection.java:253)&#60;br /&#62;
        at java.lang.Thread.run(Thread.java:662)&#60;br /&#62;
&#60;/code&#62;&#60;/p&#62;
&#60;p&#62;Any thoughts? :(
&#60;/p&#62;</description>
		</item>

	</channel>
</rss>
