Abstract
Systems and methods for estimating data reduction ratio for a data set is provided. The method comprises selecting a plurality of m elements from a data set comprising a plurality of N elements; associating an identifier hi for each of the plurality of m elements; associating an identifier he for each of the plurality of elements in the data set; tracking number of times an element i appears in a base set that includes the plurality of m elements selected from the data set; calculating a value counti that indicates the number of times an identifier he matches an identifier hi; and estimating data reduction ratio for the plurality of N elements in the data set, based on number of m number elements selected from the data set and the value counti.
Original language | English |
---|---|
Patent number | US2014052699 |
IPC | G06F 17/ 30 A I |
Priority date | 20/08/12 |
State | Published - 20 Feb 2014 |