<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="bbPress/1.0.3" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title>DataStax Support Forums &#187; Topic: Cassandra File System - Fixed Width Text File - Hive Integration Issue</title>
		<link>http://www.datastax.com/support-forums/topic/cassandra-file-system-fixed-width-text-file-hive-integration-issue</link>
		<description>Software, Support, and Training for Apache Cassandra</description>
		<language>en-US</language>
		<pubDate>Sat, 25 May 2013 18:00:35 +0000</pubDate>
		<generator>http://bbpress.org/?v=1.0.3</generator>
		<textInput>
			<title><![CDATA[Search]]></title>
			<description><![CDATA[Search all topics from these forums.]]></description>
			<name>q</name>
			<link>http://www.datastax.com/support-forums/search.php</link>
		</textInput>
		<atom:link href="http://www.datastax.com/support-forums/rss/topic/cassandra-file-system-fixed-width-text-file-hive-integration-issue" rel="self" type="application/rss+xml" />

		<item>
			<title>sam on "Cassandra File System - Fixed Width Text File - Hive Integration Issue"</title>
			<link>http://www.datastax.com/support-forums/topic/cassandra-file-system-fixed-width-text-file-hive-integration-issue#post-6659</link>
			<pubDate>Tue, 25 Sep 2012 19:04:33 +0000</pubDate>
			<dc:creator>sam</dc:creator>
			<guid isPermaLink="false">6659@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;I haven't been able to recreate the problems with the JDBC driver on either DSE 2.1 or 2.2. The only way I can get the behaviour you describe there is by running 'SHOW TABLES' from the hive cli, which triggers the bug I mentioned in the StackOverflow post. After that, queries with the JDBC driver error with &#34;Table not found 'employees'&#34;.&#60;/p&#62;
&#60;p&#62;Regarding the other issue - is it possible to remove that temporary directory from the CFS using the hadoop command line tools (i.e. bin/dse hadoop fs)? I don't know how the permissions came to be incorrect, but I think Hadoop should recreate them on demand.
&#60;/p&#62;</description>
		</item>
		<item>
			<title>tambalavanar on "Cassandra File System - Fixed Width Text File - Hive Integration Issue"</title>
			<link>http://www.datastax.com/support-forums/topic/cassandra-file-system-fixed-width-text-file-hive-integration-issue#post-6656</link>
			<pubDate>Tue, 25 Sep 2012 17:33:22 +0000</pubDate>
			<dc:creator>tambalavanar</dc:creator>
			<guid isPermaLink="false">6656@http://www.datastax.com/support-forums/</guid>
			<description>&#60;p&#62;I'm trying to read a fixed width text file stored in Cassandra File System (CFS) using Hive. I can query the file from the Hive client. However, when I run the same query through the Hadoop Hive JDBC driver, it says the table is not available or the connection is bad. Below are the steps I followed.&#60;/p&#62;
&#60;blockquote&#62;&#60;p&#62;Input file (employees.dat):&#60;/p&#62;&#60;/blockquote&#62;
&#60;p&#62;&#60;code&#62;&#60;br /&#62;
2736Ambalavanar              Thirugnanam              BNYM-EAG       2005-05-091982-12-18&#60;br /&#62;
2737Anand                    Jeyamani                 BNYM-AST       2005-05-091984-07-12&#60;br /&#62;
3123Muthukumar               Rajendran                BNYM-EES       2009-08-121988-02-23&#60;br /&#62;
&#60;/code&#62;&#60;/p&#62;
&#60;blockquote&#62;&#60;p&#62;Starting Hive Client&#60;/p&#62;&#60;/blockquote&#62;
&#60;p&#62;&#60;code&#62;bash-3.2# dse hive;&#60;br /&#62;
Logging initialized using configuration in file:/etc/dse/hive/hive-log4j.properties&#60;br /&#62;
Hive history file=/tmp/root/hive_job_log_root_201209250900_157600446.txt&#60;br /&#62;
hive&#38;gt; use HiveDB;&#60;br /&#62;
OK&#60;br /&#62;
Time taken: 1.149 seconds&#60;br /&#62;
&#60;/code&#62;&#60;/p&#62;
&#60;blockquote&#62;&#60;p&#62;Creating Hive External Table pointing to fixed width format text file&#60;/p&#62;&#60;/blockquote&#62;
&#60;p&#62;&#60;code&#62;&#60;br /&#62;
hive&#38;gt; CREATE EXTERNAL TABLE employees (empid STRING, firstname STRING, lastname STRING, dept STRING, dateofjoining STRING, dateofbirth STRING)&#60;br /&#62;
    &#38;gt; ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'&#60;br /&#62;
    &#38;gt; WITH SERDEPROPERTIES (&#34;input.regex&#34; = &#34;(.{4})(.{25})(.{25})(.{15})(.{10})(.{10}).*&#34; )&#60;br /&#62;
    &#38;gt; LOCATION 'cfs://hostname:9160/folder/';&#60;br /&#62;
OK&#60;br /&#62;
Time taken: 0.524 seconds&#60;br /&#62;
&#60;/code&#62;&#60;/p&#62;
&#60;blockquote&#62;&#60;p&#62;Do a select * from table.&#60;/p&#62;&#60;/blockquote&#62;
&#60;p&#62;&#60;code&#62;&#60;br /&#62;
hive&#38;gt; select * from employees;&#60;br /&#62;
OK&#60;br /&#62;
2736    Ambalavanar                     Thirugnanam                     BNYM-EAG        2005-05-09      1982-12-18&#60;br /&#62;
2737    Anand                           Jeyamani                        BNYM-AST        2005-05-09      1984-07-12&#60;br /&#62;
3123    Muthukumar                      Rajendran                       BNYM-EES        2009-08-12      1988-02-23&#60;br /&#62;
Time taken: 0.698 seconds&#60;br /&#62;
&#60;/code&#62;&#60;/p&#62;
&#60;blockquote&#62;&#60;p&#62;Selecting specific fields from the Hive table throws a permission error (first issue)&#60;/p&#62;&#60;/blockquote&#62;
&#60;p&#62;&#60;code&#62;&#60;br /&#62;
hive&#38;gt; select empid, firstname from employees;&#60;br /&#62;
Total MapReduce jobs = 1&#60;br /&#62;
Launching Job 1 out of 1&#60;br /&#62;
Number of reduce tasks is set to 0 since there's no reduce operator&#60;br /&#62;
java.io.IOException: The ownership/permissions on the staging directory cfs:/tmp/hadoop-root/mapred/staging/root/.staging is not as expected. It is owned by root and permissions are rwxrwxrwx. The directory must be owned by the submitter root or by root and permissions must be rwx------&#60;br /&#62;
        at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:108)&#60;br /&#62;
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)&#60;br /&#62;
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)&#60;br /&#62;
        at java.security.AccessController.doPrivileged(Native Method)&#60;br /&#62;
        at javax.security.auth.Subject.doAs(Subject.java:416)&#60;br /&#62;
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)&#60;br /&#62;
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)&#60;br /&#62;
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)&#60;br /&#62;
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:452)&#60;br /&#62;
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)&#60;br /&#62;
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)&#60;br /&#62;
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)&#60;br /&#62;
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)&#60;br /&#62;
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)&#60;br /&#62;
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)&#60;br /&#62;
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)&#60;br /&#62;
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)&#60;br /&#62;
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)&#60;br /&#62;
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)&#60;br /&#62;
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)&#60;br /&#62;
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&#60;br /&#62;
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)&#60;br /&#62;
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&#60;br /&#62;
        at java.lang.reflect.Method.invoke(Method.java:616)&#60;br /&#62;
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)&#60;br /&#62;
Job Submission failed with exception 'java.io.IOException(The ownership/permissions on the staging directory cfs:/tmp/hadoop-root/mapred/staging/root/.staging is not as expected. It is owned by root and permissions are rwxrwxrwx. The directory must be owned by the submitter root or by root and permissions must be rwx------)'&#60;br /&#62;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask&#60;br /&#62;
&#60;/code&#62;&#60;/p&#62;
&#60;p&#62;The second issue is that when I run the select * query through the JDBC Hive driver (from outside the DSE/Cassandra nodes), it says the table employees is not available. The external table behaves like a temporary table and does not get persisted. When I run 'hive&#38;gt; show tables;', the employees table is not listed. Can anyone please help me figure out the problem?
&#60;/p&#62;</description>
		</item>

	</channel>
</rss>
