sstablemetadata
Print information about an sstable from the related Statistics.db and Summary.db files to standard output.
ref: https://issues.apache.org/jira/browse/CASSANDRA-7159 and https://issues.apache.org/jira/browse/CASSANDRA-10838
Cassandra must be stopped before this tool is executed, or unexpected results will occur. Note: the script does not verify that Cassandra is stopped.
Usage
sstablemetadata <options> <sstable filename(s)>
--gc_grace_seconds <arg> | The gc_grace_seconds to use when calculating droppable tombstones |
Print all the metadata
Run sstablemetadata against the *Data.db file(s) related to a table. If necessary, find the *Data.db file(s) using sstableutil.
Example:
sstableutil keyspace1 standard1 | grep Data
/var/lib/cassandra/data/keyspace1/standard1-f6845640a6cb11e8b6836d2c86545d91/mc-1-big-Data.db
sstablemetadata /var/lib/cassandra/data/keyspace1/standard1-f6845640a6cb11e8b6836d2c86545d91/mc-1-big-Data.db
SSTable: /var/lib/cassandra/data/keyspace1/standard1-f6845640a6cb11e8b6836d2c86545d91/mc-1-big
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.010000
Minimum timestamp: 1535025576141000
Maximum timestamp: 1535025604309000
SSTable min local deletion time: 2147483647
SSTable max local deletion time: 2147483647
Compressor: org.apache.cassandra.io.compress.LZ4Compressor
TTL min: 86400
TTL max: 86400
First token: -9223004712949498654 (key=39373333373831303130)
Last token: 9222554117157811897 (key=4f3438394e39374d3730)
Estimated droppable tombstones: 0.9188263888888889
SSTable Level: 0
Repaired at: 0
Replay positions covered: {CommitLogPosition(segmentId=1535025390651, position=226400)=CommitLogPosition(segmentId=1535025390651, position=6849139)}
totalColumnsSet: 100000
totalRows: 20000
Estimated tombstone drop times:
1535039100: 80390
1535039160: 5645
1535039220: 13965
Count Row Size Cell Count
1 0 0
2 0 0
3 0 0
4 0 0
5 0 20000
6 0 0
7 0 0
8 0 0
10 0 0
12 0 0
14 0 0
17 0 0
20 0 0
24 0 0
29 0 0
35 0 0
42 0 0
50 0 0
60 0 0
72 0 0
86 0 0
103 0 0
124 0 0
149 0 0
179 0 0
215 0 0
258 20000 0
310 0 0
372 0 0
446 0 0
535 0 0
642 0 0
770 0 0
924 0 0
1109 0 0
1331 0 0
1597 0 0
1916 0 0
2299 0 0
2759 0 0
3311 0 0
3973 0 0
4768 0 0
5722 0 0
6866 0 0
8239 0 0
9887 0 0
11864 0 0
14237 0 0
17084 0 0
20501 0 0
24601 0 0
29521 0 0
35425 0 0
42510 0 0
51012 0 0
61214 0 0
73457 0 0
88148 0 0
105778 0 0
126934 0 0
152321 0 0
182785 0 0
219342 0 0
263210 0 0
315852 0 0
379022 0 0
454826 0 0
545791 0 0
654949 0 0
785939 0 0
943127 0 0
1131752 0 0
1358102 0 0
1629722 0 0
1955666 0 0
2346799 0 0
2816159 0 0
3379391 0 0
4055269 0 0
4866323 0 0
5839588 0 0
7007506 0 0
8409007 0 0
10090808 0 0
12108970 0 0
14530764 0 0
17436917 0 0
20924300 0 0
25109160 0 0
30130992 0 0
36157190 0 0
43388628 0 0
52066354 0 0
62479625 0 0
74975550 0 0
89970660 0 0
107964792 0 0
129557750 0 0
155469300 0 0
186563160 0 0
223875792 0 0
268650950 0 0
322381140 0 0
386857368 0 0
464228842 0 0
557074610 0 0
668489532 0 0
802187438 0 0
962624926 0 0
1155149911 0 0
1386179893 0 0
1663415872 0 0
1996099046 0 0
2395318855 0 0
2874382626 0
3449259151 0
4139110981 0
4966933177 0
5960319812 0
7152383774 0
8582860529 0
10299432635 0
12359319162 0
14831182994 0
17797419593 0
21356903512 0
25628284214 0
30753941057 0
36904729268 0
44285675122 0
53142810146 0
63771372175 0
76525646610 0
91830775932 0
110196931118 0
132236317342 0
158683580810 0
190420296972 0
228504356366 0
274205227639 0
329046273167 0
394855527800 0
473826633360 0
568591960032 0
682310352038 0
818772422446 0
982526906935 0
1179032288322 0
1414838745986 0
Estimated cardinality: 20196
EncodingStats minTTL: 0
EncodingStats minLocalDeletionTime: 1442880000
EncodingStats minTimestamp: 1535025565275000
KeyType: org.apache.cassandra.db.marshal.BytesType
ClusteringTypes: [org.apache.cassandra.db.marshal.UTF8Type]
StaticColumns: {C3:org.apache.cassandra.db.marshal.BytesType, C4:org.apache.cassandra.db.marshal.BytesType, C0:org.apache.cassandra.db.marshal.BytesType, C1:org.apache.cassandra.db.marshal.BytesType, C2:org.apache.cassandra.db.marshal.BytesType}
RegularColumns: {}
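Most rows of the Row Size / Cell Count histogram are zero, so it can help to filter them out when reading the output. The following is a minimal sketch, not part of the tool: `nonzero_histogram` is a hypothetical helper name, and it assumes the histogram rows are whitespace-separated as in the sample above. It reads sstablemetadata output on standard input and keeps only the header and the rows where at least one column is non-zero.

```shell
# Hypothetical helper (not shipped with Cassandra): print the histogram header
# and only those histogram rows where at least one column is non-zero.
# Reads sstablemetadata output on stdin.
nonzero_histogram() {
  awk '
    # The header line marks the start of the histogram.
    $1 == "Count" && $2 == "Row" { in_hist = 1; print; next }
    # The first line that does not start with a number ends the histogram.
    in_hist && $1 !~ /^[0-9]+$/ { in_hist = 0 }
    # Inside the histogram, keep rows with a non-zero second or third field.
    in_hist && ($2 > 0 || $3 > 0)
  '
}

# Demo on a fragment of the sample output above.
nonzero_histogram <<'EOF'
Estimated tombstone drop times:
1535039220: 13965
Count Row Size Cell Count
4 0 0
5 0 20000
215 0 0
258 20000 0
2874382626 0
Estimated cardinality: 20196
EOF
```

Against a real sstable, the same helper could be fed directly, for example `sstablemetadata <sstable filename> | nonzero_histogram`.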
Specify gc grace seconds
To see the ratio of droppable tombstones for a given gc grace seconds value, use the gc_grace_seconds option. Because sstablemetadata does not read the schema, passing the gc_grace_seconds value that is configured for the table gives a more accurate estimate of droppable tombstones. The gc_grace_seconds value provided is subtracted from the current machine time (in seconds).
ref: https://issues.apache.org/jira/browse/CASSANDRA-12208
Example:
sstablemetadata /var/lib/cassandra/data/keyspace1/standard1-41b52700b4ed11e896476d2c86545d91/mc-12-big-Data.db | grep "Estimated tombstone drop times" -A4
Estimated tombstone drop times:
1536599100: 1
1536599640: 1
1536599700: 2
echo $(date +%s)
1536602005
# if gc_grace_seconds was configured at 100, all of the tombstones would be currently droppable
sstablemetadata --gc_grace_seconds 100 /var/lib/cassandra/data/keyspace1/standard1-41b52700b4ed11e896476d2c86545d91/mc-12-big-Data.db | grep "Estimated droppable tombstones"
Estimated droppable tombstones: 4.0E-5
# if gc_grace_seconds was configured at 4700, some of the tombstones would be currently droppable
sstablemetadata --gc_grace_seconds 4700 /var/lib/cassandra/data/keyspace1/standard1-41b52700b4ed11e896476d2c86545d91/mc-12-big-Data.db | grep "Estimated droppable tombstones"
Estimated droppable tombstones: 9.61111111111111E-6
# if gc_grace_seconds was configured at 5000, none of the tombstones would be currently droppable
sstablemetadata --gc_grace_seconds 5000 /var/lib/cassandra/data/keyspace1/standard1-41b52700b4ed11e896476d2c86545d91/mc-12-big-Data.db | grep "Estimated droppable tombstones"
Estimated droppable tombstones: 0.0
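The subtraction described above can be sketched in shell. This is a simplified illustration only: `droppable_count` is a hypothetical helper, and it counts whole buckets from the "Estimated tombstone drop times" lines whose drop time is at or before the cutoff (current time minus gc_grace_seconds). The tool itself reports an estimated ratio rather than a count, so its figures will not match this whole-bucket count exactly.

```shell
# Hypothetical helper (not part of sstablemetadata): count tombstones whose
# drop-time bucket is at or before (now - gc_grace_seconds).
# Usage: droppable_count <now_epoch_seconds> <gc_grace_seconds>
# Reads "epoch_seconds: count" lines on stdin.
droppable_count() {
  now=$1
  gc_grace_seconds=$2
  cutoff=$((now - gc_grace_seconds))
  awk -F': *' -v cutoff="$cutoff" '
    $1 ~ /^[0-9]+$/ && $1 + 0 <= cutoff + 0 { total += $2 }
    END { print total + 0 }
  '
}

# Bucket lines and machine time copied from the example above.
buckets='1536599100: 1
1536599640: 1
1536599700: 2'

# gc_grace_seconds=100: cutoff is 1536601905, after every bucket.
printf '%s\n' "$buckets" | droppable_count 1536602005 100    # prints 4

# gc_grace_seconds=5000: cutoff is 1536597005, before every bucket.
printf '%s\n' "$buckets" | droppable_count 1536602005 5000   # prints 0
```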
Explanation of each value printed above
Value | Explanation |
---|---|
SSTable | prefix of the sstable filenames related to this sstable |
Partitioner | partitioner type used to distribute data across nodes; defined in cassandra.yaml |
Bloom Filter FP chance | configured false-positive chance of the Bloom filter used in reads; defined in the table definition |
Minimum timestamp | minimum timestamp of any entry in this sstable, in epoch microseconds |
Maximum timestamp | maximum timestamp of any entry in this sstable, in epoch microseconds |
SSTable min local deletion time | minimum timestamp of deletion date, based on TTL, in epoch seconds |
SSTable max local deletion time | maximum timestamp of deletion date, based on TTL, in epoch seconds |
Compressor | blank (-) by default; if not blank, indicates type of compression enabled on the table |
TTL min | minimum time-to-live in seconds; default 0 unless defined in the table definition |
TTL max | maximum time-to-live in seconds; default 0 unless defined in the table definition |
First token | lowest token and related key found in the sstable summary |
Last token | highest token and related key found in the sstable summary |
Estimated droppable tombstones | ratio of tombstones to columns, using configured gc grace seconds if relevant |
SSTable Level | compaction level of this sstable, if leveled compaction (LCS) is used |
Repaired at | the timestamp this sstable was marked as repaired via sstablerepairedset, in epoch milliseconds |
Replay positions covered | the interval of time and commitlog positions related to this sstable |
totalColumnsSet | number of cells in the sstable |
totalRows | number of rows in the sstable |
Estimated tombstone drop times | approximate number of rows that will expire, ordered by epoch seconds |
Count Row Size Cell Count | two histograms in two columns; one represents distribution of Row Size and the other represents distribution of Cell Count |
Estimated cardinality | an estimate of unique values, used for compaction |
EncodingStats* minTTL | in epoch milliseconds |
EncodingStats* minLocalDeletionTime | in epoch seconds |
EncodingStats* minTimestamp | in epoch microseconds |
KeyType | the type of partition key, useful in reading and writing data from/to storage; defined in the table definition |
ClusteringTypes | the type of clustering key, useful in reading and writing data from/to storage; defined in the table definition |
StaticColumns | a list of the shared columns in the table |
RegularColumns | a list of non-static, non-key columns in the table |
* For the encoding stats values, the delta between this value and the current epoch time is used so that data can be encoded and stored as compactly as possible.
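Because the values above mix epoch microseconds and epoch seconds, it can be useful to convert them to readable dates. The sketch below uses two values copied from the sample output. The variable names are illustrative, and the `date -d "@..."` form assumes GNU date (on BSD or macOS, `date -u -r <seconds>` is the usual equivalent).

```shell
# Values copied from the sample output above.
min_timestamp_us=1535025576141000   # Minimum timestamp (epoch microseconds)
min_local_deletion_s=1442880000     # EncodingStats minLocalDeletionTime (epoch seconds)

# Convert microseconds to whole seconds; integer division drops the sub-second part.
min_timestamp_s=$((min_timestamp_us / 1000000))
echo "$min_timestamp_s"             # prints 1535025576

# Render both as UTC dates (GNU date syntax, an assumption about the platform).
date -u -d "@$min_timestamp_s" +%Y-%m-%dT%H:%M:%SZ        # 2018-08-23T11:59:36Z
date -u -d "@$min_local_deletion_s" +%Y-%m-%dT%H:%M:%SZ   # 2015-09-22T00:00:00Z
```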