sstabledump¶
Dump contents of a given SSTable to standard output in JSON format.
You must supply exactly one sstable.
Cassandra must be stopped before this tool is executed, or unexpected results will occur. Note: the script does not verify that Cassandra is stopped.
Usage¶
sstabledump <options> <sstable file path>
-d | CQL row per line internal representation |
-e | Enumerate partition keys only |
-k <arg> | Partition key |
-x <arg> | Excluded partition key(s) |
-t | Print raw timestamps instead of iso8601 date strings |
-l | Output each row as a separate JSON object |
If necessary, use sstableutil first to find out the sstables used by a table.
Dump entire table¶
Dump the entire table without any options.
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db > eventlog_dump_2018Jul26
cat eventlog_dump_2018Jul26
[
{
"partition" : {
"key" : [ "3578d7de-c60d-4599-aefb-3f22a07b2bc6" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 61,
"liveness_info" : { "tstamp" : "2018-07-20T20:23:08.378711Z" },
"cells" : [
{ "name" : "event", "value" : "party" },
{ "name" : "insertedtimestamp", "value" : "2018-07-20 20:23:08.384Z" },
{ "name" : "source", "value" : "asdf" }
]
}
]
},
{
"partition" : {
"key" : [ "d18250c0-84fc-4d40-b957-4248dc9d790e" ],
"position" : 62
},
"rows" : [
{
"type" : "row",
"position" : 123,
"liveness_info" : { "tstamp" : "2018-07-20T20:23:07.783522Z" },
"cells" : [
{ "name" : "event", "value" : "party" },
{ "name" : "insertedtimestamp", "value" : "2018-07-20 20:23:07.789Z" },
{ "name" : "source", "value" : "asdf" }
]
}
]
},
{
"partition" : {
"key" : [ "cf188983-d85b-48d6-9365-25005289beb2" ],
"position" : 124
},
"rows" : [
{
"type" : "row",
"position" : 182,
"liveness_info" : { "tstamp" : "2018-07-20T20:22:27.028809Z" },
"cells" : [
{ "name" : "event", "value" : "party" },
{ "name" : "insertedtimestamp", "value" : "2018-07-20 20:22:27.055Z" },
{ "name" : "source", "value" : "asdf" }
]
}
]
}
]
Dump table in a more manageable format¶
Use the -l option to dump each row as a separate JSON object. This will make the output easier to manipulate for large data sets. ref: https://issues.apache.org/jira/browse/CASSANDRA-13848
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db -l > eventlog_dump_2018Jul26_justlines
cat eventlog_dump_2018Jul26_justlines
[
{
"partition" : {
"key" : [ "3578d7de-c60d-4599-aefb-3f22a07b2bc6" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 61,
"liveness_info" : { "tstamp" : "2018-07-20T20:23:08.378711Z" },
"cells" : [
{ "name" : "event", "value" : "party" },
{ "name" : "insertedtimestamp", "value" : "2018-07-20 20:23:08.384Z" },
{ "name" : "source", "value" : "asdf" }
]
}
]
},
{
"partition" : {
"key" : [ "d18250c0-84fc-4d40-b957-4248dc9d790e" ],
"position" : 62
},
"rows" : [
{
"type" : "row",
"position" : 123,
"liveness_info" : { "tstamp" : "2018-07-20T20:23:07.783522Z" },
"cells" : [
{ "name" : "event", "value" : "party" },
{ "name" : "insertedtimestamp", "value" : "2018-07-20 20:23:07.789Z" },
{ "name" : "source", "value" : "asdf" }
]
}
]
},
{
"partition" : {
"key" : [ "cf188983-d85b-48d6-9365-25005289beb2" ],
"position" : 124
},
"rows" : [
{
"type" : "row",
"position" : 182,
"liveness_info" : { "tstamp" : "2018-07-20T20:22:27.028809Z" },
"cells" : [
{ "name" : "event", "value" : "party" },
{ "name" : "insertedtimestamp", "value" : "2018-07-20 20:22:27.055Z" },
{ "name" : "source", "value" : "asdf" }
]
}
]
}
Dump only keys¶
Dump only the keys by using the -e option.
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db -e > eventlog_dump_2018Jul26_justkeys
cat eventlog_dump_2018Jul26b
[ [ "3578d7de-c60d-4599-aefb-3f22a07b2bc6" ], [ "d18250c0-84fc-4d40-b957-4248dc9d790e" ], [ "cf188983-d85b-48d6-9365-25005289beb2" ]
Dump row for a single key¶
Dump a single key using the -k option.
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db -k 3578d7de-c60d-4599-aefb-3f22a07b2bc6 > eventlog_dump_2018Jul26_singlekey
cat eventlog_dump_2018Jul26_singlekey
[
{
"partition" : {
"key" : [ "3578d7de-c60d-4599-aefb-3f22a07b2bc6" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 61,
"liveness_info" : { "tstamp" : "2018-07-20T20:23:08.378711Z" },
"cells" : [
{ "name" : "event", "value" : "party" },
{ "name" : "insertedtimestamp", "value" : "2018-07-20 20:23:08.384Z" },
{ "name" : "source", "value" : "asdf" }
]
}
]
}
Exclude a key or keys in dump of rows¶
Dump a table except for the rows excluded with the -x option. Multiple keys can be used.
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db -x 3578d7de-c60d-4599-aefb-3f22a07b2bc6 d18250c0-84fc-4d40-b957-4248dc9d790e > eventlog_dump_2018Jul26_excludekeys
cat eventlog_dump_2018Jul26_excludekeys
[
{
"partition" : {
"key" : [ "cf188983-d85b-48d6-9365-25005289beb2" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 182,
"liveness_info" : { "tstamp" : "2018-07-20T20:22:27.028809Z" },
"cells" : [
{ "name" : "event", "value" : "party" },
{ "name" : "insertedtimestamp", "value" : "2018-07-20 20:22:27.055Z" },
{ "name" : "source", "value" : "asdf" }
]
}
]
}
Display raw timestamps¶
By default, dates are displayed in iso8601 date format. Using the -t option will dump the data with the raw timestamp.
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db -t -k cf188983-d85b-48d6-9365-25005289beb2 > eventlog_dump_2018Jul26_times
cat eventlog_dump_2018Jul26_times
[
{
"partition" : {
"key" : [ "cf188983-d85b-48d6-9365-25005289beb2" ],
"position" : 124
},
"rows" : [
{
"type" : "row",
"position" : 182,
"liveness_info" : { "tstamp" : "1532118147028809" },
"cells" : [
{ "name" : "event", "value" : "party" },
{ "name" : "insertedtimestamp", "value" : "2018-07-20 20:22:27.055Z" },
{ "name" : "source", "value" : "asdf" }
]
}
]
}
Display internal structure in output¶
Dump the table in a format that reflects the internal structure.
Example:
sstabledump /var/lib/cassandra/data/keyspace/eventlog-65c429e08c5a11e8939edf4f403979ef/mc-1-big-Data.db -d > eventlog_dump_2018Jul26_d
cat eventlog_dump_2018Jul26_d
[3578d7de-c60d-4599-aefb-3f22a07b2bc6]@0 Row[info=[ts=1532118188378711] ]: | [event=party ts=1532118188378711], [insertedtimestamp=2018-07-20 20:23Z ts=1532118188378711], [source=asdf ts=1532118188378711]
[d18250c0-84fc-4d40-b957-4248dc9d790e]@62 Row[info=[ts=1532118187783522] ]: | [event=party ts=1532118187783522], [insertedtimestamp=2018-07-20 20:23Z ts=1532118187783522], [source=asdf ts=1532118187783522]
[cf188983-d85b-48d6-9365-25005289beb2]@124 Row[info=[ts=1532118147028809] ]: | [event=party ts=1532118147028809], [insertedtimestamp=2018-07-20 20:22Z ts=1532118147028809], [source=asdf ts=1532118147028809]