<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="bbPress/1.0.3" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title>DataStax Support Forums &#187; Topic: Bulkloading SSTables, including Solr content</title>
		<link>http://www.datastax.com/support-forums/topic/bulkloading-sstables-including-solr-content</link>
		<description>Software, Support, and Training for Apache Cassandra</description>
		<language>en-US</language>
		<pubDate>Wed, 22 May 2013 10:24:39 +0000</pubDate>
		<generator>http://bbpress.org/?v=1.0.3</generator>
		<textInput>
			<title><![CDATA[Search]]></title>
			<description><![CDATA[Search all topics from these forums.]]></description>
			<name>q</name>
			<link>http://www.datastax.com/support-forums/search.php</link>
		</textInput>
		<atom:link href="http://www.datastax.com/support-forums/rss/topic/bulkloading-sstables-including-solr-content" rel="self" type="application/rss+xml" />

		<item>
			<title>jas on "Bulkloading SSTables, including Solr content"</title>
			<link>http://www.datastax.com/support-forums/topic/bulkloading-sstables-including-solr-content#post-1746</link>
			<pubDate>Mon, 23 Apr 2012 20:06:35 +0000</pubDate>
			<dc:creator>jas</dc:creator>
			<guid isPermaLink="false">1746@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;Given this read-only content of mine, is it possible to issue a Solr query with a consistency level of ONE?  I see in the DSE 2 docs that &#60;code&#62;cl=ONE&#60;/code&#62; can be specified when using the Solr HTTP API.  However, the example and discussion pertain to issuing an update, not a query.  Given I have RF&#38;gt;1, it seems that response time could perhaps be reduced slightly with &#60;code&#62;cl=ONE&#60;/code&#62; rather than waiting for two nodes to respond?&#60;/p&#62;
&#60;p&#62;Can the cl parameter be applied to a query?  What's the default?  I'm guessing LOCAL_QUORUM perhaps?&#60;/p&#62;
&#60;p&#62;Thanks,&#60;/p&#62;
&#60;p&#62;Jeff
&#60;/p&#62;</description>
		</item>
		<item>
			<title>jas on "Bulkloading SSTables, including Solr content"</title>
			<link>http://www.datastax.com/support-forums/topic/bulkloading-sstables-including-solr-content#post-1745</link>
			<pubDate>Mon, 23 Apr 2012 16:21:34 +0000</pubDate>
			<dc:creator>jas</dc:creator>
			<guid isPermaLink="false">1745@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;Thanks Jake.&#60;/p&#62;
&#60;p&#62;The more I think about it, the more I like the notion of defining a new column family per quarter. Once it is loaded and verified, I can signal the app in some manner to refer to that new core.  I could even make the name of the core tenant-specific and allow an individual tenant some flexibility as to when to migrate to the new content.&#60;/p&#62;
&#60;p&#62;Other non-search-related column families are also associated with the quarterly update, but the same mechanism can be applied to them.&#60;/p&#62;
&#60;p&#62;Looking forward to seeing what appears in 2.1. :)&#60;/p&#62;
&#60;p&#62;Jeff
&#60;/p&#62;</description>
		</item>
		<item>
			<title>tjake on "Bulkloading SSTables, including Solr content"</title>
			<link>http://www.datastax.com/support-forums/topic/bulkloading-sstables-including-solr-content#post-1744</link>
			<pubDate>Mon, 23 Apr 2012 15:39:11 +0000</pubDate>
			<dc:creator>tjake</dc:creator>
			<guid isPermaLink="false">1744@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;Hi Jeff,&#60;/p&#62;
&#60;p&#62;I'll work backwards.  Correct, solrjson:[&#34;Human&#34;,&#34;Mouse&#34;,&#34;Rat&#34;,&#34;Cow&#34;] is how you define a multivalued field.&#60;/p&#62;
&#60;p&#62;Next, if you currently drop your Solr index and rebuild it each quarter, why not use a new column family per quarter and instruct your app to use the new one?  You can also take one node out of the cluster at a time (assuming RF&#38;gt;1), rebuild the data on it, and then bring it back in.&#60;/p&#62;
&#60;p&#62;We are working on a workflow for migrating from one schema to another using the same data, but it will not be in 2.0.&#60;/p&#62;
&#60;p&#62;-Jake
&#60;/p&#62;</description>
		</item>
		<item>
			<title>jas on "Bulkloading SSTables, including Solr content"</title>
			<link>http://www.datastax.com/support-forums/topic/bulkloading-sstables-including-solr-content#post-1743</link>
			<pubDate>Mon, 23 Apr 2012 03:09:08 +0000</pubDate>
			<dc:creator>jas</dc:creator>
			<guid isPermaLink="false">1743@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;Hello:&#60;/p&#62;
&#60;p&#62;A major aspect of my application is that in addition to tenant-provided content, there is a quantity of 'canonical content' that gets updated once a quarter.  I am making it available for both search and retrieval. I'm looking for a more production-worthy way to accomplish this.&#60;/p&#62;
&#60;p&#62;Prior to the release of DSE 2.0, this content existed in both Cassandra (non-search) and Solr. For the former, the content build process generates SSTables using org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter; for the latter, it generates files in native Solr XML format. Being pre-production at the time, and only using a single Cassandra node, it was pretty straightforward to shut down Cassandra and replace the SSTables with the new ones.  Likewise for Solr, I could remove the old index and ingest the new Solr XML files.&#60;/p&#62;
&#60;p&#62;Regardless of DSE 2, moving to multiple Cassandra nodes requires more finesse than my procedure above. It seems what I need to do in that case is described here:&#60;/p&#62;
&#60;p&#62;&#60;a href=&#34;http://www.datastax.com/dev/blog/bulk-loading&#34; rel=&#34;nofollow&#34;&#62;http://www.datastax.com/dev/blog/bulk-loading&#60;/a&#62;&#60;/p&#62;
&#60;p&#62;However, Sylvain's blog post is dated August 1, 2011 and speaks of Cassandra 0.8.1. Is using sstableloader still the way to go in the Cassandra 1.0.x, and soon to be 1.1, world? I'm already generating the SSTables, but they need to be uploaded to the cluster rather than just a single node.&#60;/p&#62;
&#60;p&#62;With DSE 2, it sounds like I can continue to index the existing Solr XML files I generate, or upload the new content directly into Cassandra, also using sstableloader. The search-related keyspace and column family (Solr core) definition may or may not change as well for a given quarterly update.&#60;/p&#62;
&#60;p&#62;With Solr today, I can issue a deletion query and remove all documents, then process a series of files to define the new content, and at the end of that, issue a commit. Once that has finally completed, then new queries will use that data. Until then, current searches will continue to refer to the old content. Is there a way to accomplish something similar going through Cassandra via sstableloader?  I'm guessing the content will be updated as it goes, becoming eventually consistent across the cluster.&#60;/p&#62;
&#60;p&#62;I suppose another approach would be to create a new version of the core via the HTTP API, bulk load the content, and then instruct my app to refer to the new core.  The name of the current content core could be set in Cassandra itself, which the app will read and start using. At some point, the old core can be deleted.  I kind of like this approach since I'll only have to deal with SSTables and Cassandra, and not Solr XML etc.&#60;/p&#62;
&#60;p&#62;Finally, the canonical content consists of numerous multi-valued fields. If I pull up such a document field in the CLI, I see:&#60;/p&#62;
&#60;pre&#62;&#60;code&#62;(column=n_macromolecule_species, value=solrjson:[&#38;quot;Human&#38;quot;,&#38;quot;Mouse&#38;quot;,&#38;quot;Rat&#38;quot;,&#38;quot;Cow&#38;quot;], timestamp=1333552307242000)&#60;/code&#62;&#60;/pre&#62;
&#60;p&#62;Is that column value a standard I can make use of to define multi-valued fields, or is it some secret DSE Solr integration thing?&#60;/p&#62;
&#60;p&#62;Thanks!&#60;/p&#62;
&#60;p&#62;Jeff
&#60;/p&#62;</description>
		</item>

	</channel>
</rss>
