Thanks again. However, I will need a couple more days to try this out due to other priorities.
How do you use composite columns (from cassandra) in a hive table?
(17 posts) (6 voices)
Posted 3 months ago #
Finally, I am back to this. Sorry for the delay.
Composite Column Name = UTF8Type + LongType/IntegerType/UTF8Type
Note: I get the same exception for every one of the above types in the second component.
My doubts:
- Looking at the exception log, "value" seems pretty odd. Is that a problem?
- In the createCompositeKey method of the UDF Java code, the third (end-of-component) and fourth (lastIsOne) arguments are hard-coded to '0' and 'true' respectively. Frankly, I didn't understand them completely, even after reading the Javadoc of the Composite type several times. Please advise.
Here is the exception from the log:
2013-02-19 22:15:32,802 null map = 100%, reduce = 0%
[2013-02-19 22:15:34,916] FATAL {ExecReducer} - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"CSCO","_col1":"price","_col2":1360821839400},"value":{"_col0":{"count":1,"sum":25.13}},"alias":0}
    at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:256)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:518)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:419)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:256)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: InvalidRequestException(why:Not enough bytes to read value of component 0)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:603)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:959)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:798)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:724)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247)
    ... 3 more
Caused by: java.io.IOException: InvalidRequestException(why:Not enough bytes to read value of component 0)
    at org.apache.hadoop.hive.cassandra.output.CassandraAbstractPut.commitChanges(CassandraAbstractPut.java:69)
    at org.apache.hadoop.hive.cassandra.output.CassandraPut.write(CassandraPut.java:139)
    at org.apache.hadoop.hive.cassandra.output.HiveCassandraOutputFormat$1.write(HiveCassandraOutputFormat.java:69)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:589)
    ... 16 more
Here is my UDF based on your sample code.
    package com.cisco.iep.hive.plugins;

    import java.nio.ByteBuffer;

    import org.apache.cassandra.utils.ByteBufferUtil;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.BytesWritable;

    public class WriteCompositeStringLong extends UDF {

        public BytesWritable evaluate(final String strCol, final Long longCol) {
            ByteBuffer byteBuff = createCompositeKey(strCol, longCol, 0, false);
            return new BytesWritable(byteBuff.array());
        }

        private ByteBuffer createCompositeKey(String strCol, Long longCol, int endOfComponent, boolean lastIsOne) {
            ByteBuffer bytes = ByteBufferUtil.bytes(strCol);
            // Each component takes 2 (length) + value size + 1 (end-of-component byte).
            int totalSize = 0;
            if (strCol != null) {
                totalSize += 2 + bytes.remaining() + 1;
                if (longCol != null) {
                    totalSize += 2 + 8 + 1;
                    if (endOfComponent != -1) {
                        totalSize += 2 + 1 + 1;
                    }
                }
            }

            ByteBuffer bb = ByteBuffer.allocate(totalSize);

            if (strCol != null) {
                bb.putShort((short) bytes.remaining());
                bb.put(bytes);
                bb.put(longCol == null && lastIsOne ? (byte) 1 : (byte) 0);
                if (longCol != null) {
                    //bb.putShort((short) 16);
                    //bb.put(UUIDGen.decompose(intCol));
                    bb.putShort((short) 8); //8 for Long value
                    bb.putLong(longCol);
                    bb.put(endOfComponent == -1 && lastIsOne ? (byte) 1 : (byte) 0);
                    if (endOfComponent != -1) {
                        // We are putting a byte only because our test use ints that
                        // fit in a byte *and* IntegerType.fromString() will
                        // return something compatible (i.e, putting a full int here
                        // would break 'fromStringTest')
                        bb.putShort((short) 1);
                        bb.put((byte) endOfComponent);
                        bb.put(lastIsOne ? (byte) 1 : (byte) 0);
                    }
                }
            }
            bb.rewind();
            return bb;
        }
    }

I can also paste the other missing pieces, such as the Hive CREATE TABLE script and the source table details.
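For reference while reading the UDF above: the CompositeType Javadoc describes each component as a 2-byte length, then the value bytes, then a 1-byte end-of-component marker, which is 0 for stored column names. Below is a minimal sketch of that layout for exactly two components (a UTF-8 string and an 8-byte long), using only the JDK. The class and method names are hypothetical, and it assumes the column family's comparator is CompositeType(UTF8Type, LongType). Unlike the UDF above, it does not append a third one-byte component. It is meant for comparing byte layouts, not as a confirmed fix for the exception.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the Hive or Cassandra APIs.
final class CompositeSketch {

    // Encodes (UTF-8 string, 8-byte long) as two composite components.
    // Assumed layout per component: 2-byte length, value, 1-byte end-of-component.
    static byte[] encode(String text, long number) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer bb = ByteBuffer.allocate((2 + utf8.length + 1) + (2 + 8 + 1));

        bb.putShort((short) utf8.length); // length of the first component
        bb.put(utf8);                     // first component value
        bb.put((byte) 0);                 // end-of-component: 0 for a stored name

        bb.putShort((short) 8);           // a long is always 8 bytes
        bb.putLong(number);               // second component value (big-endian)
        bb.put((byte) 0);                 // end-of-component

        return bb.array();                // backing array is exactly the allocated size
    }

    public static void main(String[] args) {
        byte[] name = encode("price", 1360821839400L);
        System.out.println("encoded length = " + name.length);
    }
}
```

With the values from the failing row ("price" and 1360821839400), the result is 2 + 5 + 1 + 2 + 8 + 1 = 19 bytes, which can be compared byte by byte against what the UDF produces for the same inputs.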
I feel guilty for not having resolved this issue yet, even after so much help from you.
I really appreciate your time and help with this!
Posted 2 months ago #
