I am running a single node, but with a listen address other than localhost.
I think this is stopping cfs:// from connecting correctly for I/O operations. Two examples are below: the first from Pig, the second from running brisk hadoop fs -ls on the command line.
The second one looks like a parsing error.
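A side note on the Pig log that follows: the temporary output path is printed as cfs://null/tmp/..., so the authority part of that URI is the literal string "null". The Python sketch here only shows how such a URI splits into its parts. It is not Brisk code, the helper name cfs_authority is made up for illustration, and reading "null" as "the host was never resolved" is my assumption, not something the log confirms.

```python
from urllib.parse import urlsplit


def cfs_authority(uri):
    """Return the authority (host) part of a cfs:// style URI as a string.

    Hypothetical helper for illustration only; not part of Brisk or Hadoop.
    """
    return urlsplit(uri).netloc


# Path copied from the Pig log in this report.
store_path = "cfs://null/tmp/temp1522760494/tmp1523872576"

parts = urlsplit(store_path)
print(parts.scheme)               # cfs
print(cfs_authority(store_path))  # null
print(parts.path)                 # /tmp/temp1522760494/tmp1523872576
```

If the authority were being filled in from the node's listen address, I would expect an address there rather than the string "null".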
--- Log from Pig
grunt> entities = LOAD 'cassandra://CassBuilderPRJT/rtbEnts' USING CassandraStorage() AS (ents,columns: bag {T: tuple(col, val)});
grunt> B = GROUP entities by ents;
grunt> X = FOREACH B GENERATE COUNT(entities);
grunt> dump X;
2011-07-01 15:22:34,242 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY
2011-07-01 15:22:34,243 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-07-01 15:22:34,514 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: X: Store(cfs://null/tmp/temp1522760494/tmp1523872576:org.apache.pig.impl.io.InterStorage) - scope-15 Operator Key: scope-15)
2011-07-01 15:22:34,536 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2011-07-01 15:22:34,552 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer - Choosing to move algebraic foreach to combiner
2011-07-01 15:22:34,606 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2011-07-01 15:22:34,606 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2011-07-01 15:22:34,845 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-07-01 15:22:34,875 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-01 15:22:36,827 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-07-01 15:22:36,864 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
2011-07-01 15:22:36,865 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
2011-07-01 15:22:36,926 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-07-01 15:22:37,427 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-07-01 15:22:38,081 [Thread-5] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 181
2011-07-01 15:22:38,814 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201107011409_0006
2011-07-01 15:22:38,814 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://10.20.1.205:50030/jobdetails.jsp?jobid=job_201107011409_0006
2011-07-01 15:24:35,461 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 4% complete
2011-07-01 15:29:24,091 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 15% complete
2011-07-01 15:40:09,489 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 40% complete
2011-07-01 15:49:44,907 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 63% complete
2011-07-01 15:51:08,813 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
2011-07-01 15:52:24,638 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-07-01 15:52:24,640 [main] INFO org.apache.pig.tools.pigstats.PigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
0.20.203.1-brisk1-beta2 0.8.3 prjt 2011-07-01 15:22:34 2011-07-01 15:52:24 GROUP_BY
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTIme AvgMapTime MaxReduceTime MinReduceTime AvgReduceTime Alias Feature Outputs
job_201107011409_0006 181 1 27 12 18 1677 1677 1677 B,X,entities GROUP_BY,COMBINER cfs://null/tmp/temp1522760494/tmp1523872576,
Input(s):
Successfully read 11900289 records from: "cassandra://CassBuilderPRJT/rtbEnts"
Output(s):
Successfully stored 11900289 records in: "cfs://null/tmp/temp1522760494/tmp1523872576"
Counters:
Total records written : 11900289
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_201107011409_0006
2011-07-01 15:52:24,707 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2011-07-01 15:52:24,756 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-07-01 15:52:24,756 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2011-07-01 15:52:24,816 [main] ERROR org.apache.pig.backend.hadoop.executionengine.HJob - java.lang.RuntimeException: Local file does not exist: /var/lib/cassandra/data/cfs/sblocks-g-41-Data.db
2011-07-01 15:52:24,817 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. java.lang.RuntimeException: Local file does not exist: /var/lib/cassandra/data/cfs/sblocks-g-41-Data.db
2011-07-01 15:52:24,817 [main] WARN org.apache.pig.tools.grunt.Grunt - There is no log file to write to.
2011-07-01 15:52:24,817 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.Error: java.lang.RuntimeException: Local file does not exist: /var/lib/cassandra/data/cfs/sblocks-g-41-Data.db
at org.apache.pig.backend.hadoop.executionengine.HJob$1.hasNext(HJob.java:118)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:616)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
at org.apache.pig.Main.run(Main.java:455)
at org.apache.pig.Main.main(Main.java:107)
Caused by: java.lang.RuntimeException: Local file does not exist: /var/lib/cassandra/data/cfs/sblocks-g-41-Data.db
at org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.readLocalBlock(CassandraFileSystemThriftStore.java:477)
at org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.retrieveSubBlock(CassandraFileSystemThriftStore.java:398)
at org.apache.cassandra.hadoop.fs.CassandraSubBlockInputStream.subBlockSeekTo(CassandraSubBlockInputStream.java:140)
at org.apache.cassandra.hadoop.fs.CassandraSubBlockInputStream.read(CassandraSubBlockInputStream.java:90)
at org.apache.cassandra.hadoop.fs.CassandraInputStream.read(CassandraInputStream.java:135)
at java.io.DataInputStream.read(DataInputStream.java:132)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
at org.apache.pig.impl.io.BufferedPositionedInputStream.read(BufferedPositionedInputStream.java:52)
at org.apache.pig.impl.io.InterRecordReader.nextKeyValue(InterRecordReader.java:86)
at org.apache.pig.impl.io.InterStorage.getNext(InterStorage.java:77)
at org.apache.pig.impl.io.ReadToEndLoader.getNextHelper(ReadToEndLoader.java:209)
at org.apache.pig.impl.io.ReadToEndLoader.getNext(ReadToEndLoader.java:189)
at org.apache.pig.backend.hadoop.executionengine.HJob$1.hasNext(HJob.java:111)
... 7 more
--- Missing seeds?
> brisk hadoop fs -ls
11/07/01 16:03:49 ERROR config.DatabaseDescriptor: Fatal configuration error error
Can't construct a java object for tag:yaml.org,2002:org.apache.cassandra.config.Config; exception=Cannot create property=seeds for JavaBean=org.apache.cassandra.config.Config@5c1428ea; Unable to find property 'seeds' on class: org.apache.cassandra.config.Config
in "<reader>", line 10, column 1:
cluster_name: 'Test Cluster'
^
at org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.construct(Constructor.java:372)
at org.yaml.snakeyaml.constructor.BaseConstructor.constructObject(BaseConstructor.java:177)
at org.yaml.snakeyaml.constructor.BaseConstructor.constructDocument(BaseConstructor.java:136)
at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:122)
at org.yaml.snakeyaml.Loader.load(Loader.java:52)
at org.yaml.snakeyaml.Yaml.load(Yaml.java:166)
at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:138)
at org.apache.cassandra.utils.FBUtilities.getLocalAddress(FBUtilities.java:121)
at org.apache.cassandra.hadoop.fs.CassandraFileSystemThriftStore.initialize(CassandraFileSystemThriftStore.java:128)
at org.apache.cassandra.hadoop.fs.CassandraFileSystem.initialize(CassandraFileSystem.java:59)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1310)
at org.apache.hadoop.fs.FileSystem.access$100(FileSystem.java:65)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1328)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:226)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:109)
at org.apache.hadoop.fs.FsShell.init(FsShell.java:82)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1745)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1895)
Caused by: org.yaml.snakeyaml.error.YAMLException: Cannot create property=seeds for JavaBean=org.apache.cassandra.config.Config@5c1428ea; Unable to find property 'seeds' on class: org.apache.cassandra.config.Config
at org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.constructJavaBean2ndStep(Constructor.java:305)
at org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.construct(Constructor.java:184)
at org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.construct(Constructor.java:370)
... 19 more
Caused by: org.yaml.snakeyaml.error.YAMLException: Unable to find property 'seeds' on class: org.apache.cassandra.config.Config
at org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.getProperty(Constructor.java:342)
at org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.constructJavaBean2ndStep(Constructor.java:240)
... 21 more
null; Can't construct a java object for tag:yaml.org,2002:org.apache.cassandra.config.Config; exception=Cannot create property=seeds for JavaBean=org.apache.cassandra.config.Config@5c1428ea; Unable to find property 'seeds' on class: org.apache.cassandra.config.Config
Invalid yaml; unable to start server. See log for stacktrace.
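
The error above says the loaded Config class has no 'seeds' property, which suggests the cassandra.yaml being picked up still has a top-level seeds entry. As far as I know, Cassandra 0.8 style configs use a seed_provider block instead, but treat that as an assumption to verify against the version Brisk ships. A rough Python sketch for checking which style a given yaml file uses follows; the function names and the sample text are hypothetical, and the check is a plain line scan, not a real YAML parse.

```python
import re


def top_level_keys(yaml_text):
    """Collect keys from unindented 'key:' lines; comments and nested keys are skipped."""
    keys = []
    for line in yaml_text.splitlines():
        match = re.match(r"^([A-Za-z_][A-Za-z0-9_]*)\s*:", line)
        if match:
            keys.append(match.group(1))
    return keys


def seed_style(yaml_text):
    """Return 'seeds', 'seed_provider', or None depending on which top-level key is present."""
    keys = top_level_keys(yaml_text)
    if "seeds" in keys:
        return "seeds"
    if "seed_provider" in keys:
        return "seed_provider"
    return None


# Hypothetical sample text; the address is a placeholder, not taken from my config.
sample = "cluster_name: 'Test Cluster'\nseeds:\n    - 127.0.0.1\n"
print(seed_style(sample))  # seeds
```

Running seed_style on the contents of each cassandra.yaml on the classpath should show whether the hadoop command is reading a different file from the one the server uses.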
