How many subcolumns are in each supercolumn and how large are the
values? Your example shows 8 subcolumns, but I didn't know if that was
the actual number. I've been able to read columns out of Cassandra at
rates an order of magnitude higher than what you're seeing here, but
there are too many variables to compare directly.
Keep in mind that the result of each Thrift call has to fit into
memory - you might be better off paging through the 23000 columns,
reading a few thousand at a time.
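
Here is a rough sketch of what I mean by paging. It is not tied to the
real Thrift client: SliceFetcher, inMemory and readAll are hypothetical
names I made up to stand in for a single get_slice call, and an in-memory
sorted set plays the role of the row. It assumes the slice start is
inclusive, so every page after the first begins with the last name from
the previous page and that one entry is skipped. In your code the fetch
would presumably be a get_slice with sliceRange.setStart(...) set to the
last name seen and sliceRange.setCount(...) set to the page size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.function.Consumer;

public class PagedSliceSketch {

    /** Hypothetical stand-in for one get_slice call: up to count names >= start, in sorted order. */
    public interface SliceFetcher {
        List<String> fetch(String start, int count);
    }

    /** In-memory fetcher used only for illustration; a real one would call the Thrift client. */
    public static SliceFetcher inMemory(NavigableSet<String> names) {
        return (start, count) -> {
            List<String> out = new ArrayList<>();
            for (String name : names.tailSet(start, true)) {
                if (out.size() == count) break;
                out.add(name);
            }
            return out;
        };
    }

    /** Reads every name in pages of pageSize, passing each name to sink once. Returns the total read. */
    public static int readAll(SliceFetcher fetcher, int pageSize, Consumer<String> sink) {
        if (pageSize < 2) {
            // With an inclusive start, a page of size 1 would never advance.
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        String start = "";      // empty start means "from the beginning of the row"
        String lastSeen = null; // last name handed to sink, null before the first page
        int total = 0;
        while (true) {
            List<String> page = fetcher.fetch(start, pageSize);
            // Skip the first entry only if it repeats the last name of the previous page.
            int from = (lastSeen != null && !page.isEmpty() && page.get(0).equals(lastSeen)) ? 1 : 0;
            for (int i = from; i < page.size(); i++) {
                lastSeen = page.get(i);
                sink.accept(lastSeen);
                total++;
            }
            if (page.size() < pageSize) break; // short page: nothing left to read
            start = lastSeen;
        }
        return total;
    }

    public static void main(String[] args) {
        NavigableSet<String> names = new TreeSet<>();
        for (int i = 0; i < 23000; i++) {
            names.add(String.format("sc%05d", i)); // zero-padded so string order matches numeric order
        }
        int total = readAll(inMemory(names), 2000, name -> { /* process one super column here */ });
        System.out.println("read " + total + " names");
    }
}
```

The point is that only one page is held in memory at a time, at the cost
of one extra (repeated) entry per call.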
Ben
On Fri, Jun 4, 2010 at 11:01 AM, Per Olesen <pol@trifork.com> wrote:
> On Jun 4, 2010, at 4:46 PM, Jonathan Ellis wrote:
>
>> get_slice reads a single row. do you mean there are 23,000 columns,
>> or are you running get_slice in a loop 23000 times?
>
> Hi Jonathan, thanks for answering!
>
> No, I do only one get_slice call.
>
> There are 23.000 SUPER columns, which I read using get_slice with ColumnParent parameter
> set to only CF name (Dashboard) and a SlicePredicate, that has "" for begin on super column
> name and "" for end on super column name.
>
> So, I do one single get_slice to get all the super-columns. This is the thrift call,
> that takes approx. 6-8 secs.
>
> I then iterate over this after the call, to extract columns for each super-column, but
> that is not in my timings and it also performs no thrift calls.
>
> Like this:
>
>>> ColumnParent parent = new ColumnParent("Dashboard");
>>>
>>> SlicePredicate predicate = new SlicePredicate();
>>> SliceRange sliceRange = new SliceRange();
>>> sliceRange.setCount(Integer.MAX_VALUE);
>>> sliceRange.setStart(toRawValue(""));
>>> sliceRange.setFinish(toRawValue(""));
>>> predicate.setSlice_range(sliceRange);
>>>
>>> // timing this takes 6-8 secs.
>>> return client.get_slice(
>>> "keyspace",
>>> "theusername",
>>> parent,
>>> predicate,
>>> ConsistencyLevel.QUORUM
>>> );
>
>