Hi there. We are just switching to CQL for some of our data analysis (we were using direct Thrift queries), and I have a question which I'm guessing must have come up before. I have a large amount of user data in a super column, so I have written a PHP script to read that data and push it into a wide column suitable for CQL.
At the moment I have a loop that looks something like this:
Read a chunk of keys from the super column like so:
$cCQLResult = $this->ExecuteCQL($cCQLDatasource, "SELECT KEY FROM PlayrUserData WHERE KEY >= '$StartKey' AND KEY < '$FinishKey'");
Now, for that chunk, build a list of keys that I wish to write (using a Thrift query into the super column).
For those keys that pass a simple test, write them all into a wide column family:
$this->ExecuteCQL($cCQLDatasource, "UPDATE MarketingCohortCache SET '000001.Facebook.1517061723'='1','000020.Facebook.100003474181964'='1', .. WHERE KEY=LapsedUser_1330560000");
The problem appears to be that I am attempting to UPDATE MarketingCohortCache with a pretty long list (about 5000 keys), and this seems to be choking the Thrift interface, which fails with: TSocket: Could not write 194718 bytes (194718 being the length of the UPDATE string).
Is there a recommended way to write a large quantity of data to a wide column? Do I need to loop through with multiple updates? If so, how long can I let the update string get?
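For illustration, the "multiple updates" approach I have in mind would look roughly like the sketch below. The helper name buildChunkedUpdates and the chunk size of 500 are my own placeholders, not anything Cassandra recommends, and I am assuming that quoting the row key (as in my SELECT above) is acceptable. It only builds the statement strings; each one would then be passed to ExecuteCQL.

```php
<?php
// Hypothetical helper: splits a long list of column names into several
// smaller UPDATE statements so that no single statement gets too long.
// The default chunk size is an arbitrary guess, not a documented limit.
function buildChunkedUpdates(string $columnFamily, string $rowKey, array $columnNames, int $chunkSize = 500): array
{
    $statements = [];
    // Guard against a chunk size below 1, which array_chunk rejects.
    foreach (array_chunk($columnNames, max(1, $chunkSize)) as $chunk) {
        $assignments = [];
        foreach ($chunk as $name) {
            // Double any single quotes so the name stays a valid string literal.
            $assignments[] = "'" . str_replace("'", "''", $name) . "'='1'";
        }
        $statements[] = "UPDATE $columnFamily SET " . implode(',', $assignments)
            . " WHERE KEY='" . str_replace("'", "''", $rowKey) . "'";
    }
    return $statements;
}

// Example with the two column names from the statement above.
$columns = ['000001.Facebook.1517061723', '000020.Facebook.100003474181964'];
foreach (buildChunkedUpdates('MarketingCohortCache', 'LapsedUser_1330560000', $columns) as $cql) {
    // In my script this would be: $this->ExecuteCQL($cCQLDatasource, $cql);
    echo strlen($cql), " bytes\n";
}
```

With about 5000 columns and a chunk size of 500 this would issue 10 UPDATE statements instead of one, but I don't know whether that is the right way to go or what a sensible chunk size would be.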
Many thanks for reading.