2.6. jumpavg

2.6.1. AvgStdevStats suite

Module holding AvgStdevStats class.

class resources.libraries.python.jumpavg.AvgStdevStats.AvgStdevStats(size=0, avg=0.0, stdev=0.0)

Bases: object

Class for statistics which include average and stdev of a group.

Unlike other stats types, adding values to the group is computationally cheap even without any caching.

Instances are only statistics, the data itself is stored elsewhere.

classmethod for_runs(runs)

Return new stats instance describing the sequence of runs.

If you want to append data to an existing stats object, you can simply use the stats object as the first run.

Instead of a verb, “for” is used to start this method name, to signify the result contains less information than the input data.

Here, Run is a hypothetical abstract class, a union of float and cls. Defining that as a real abstract class in Python 2 is too much hassle.

Parameters

runs (Iterable[Union[float, cls]]) – Sequence of data to describe by the new metadata.

Returns

The new stats instance.

Return type

cls
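
The following is a minimal sketch of how a for_runs-style constructor can accept both plain floats and existing stats objects. SketchStats is a hypothetical stand-in, not the library class, and the sketch assumes that stdev means the population standard deviation and that groups are merged with the standard pairwise variance update; the real implementation may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class SketchStats:
    """Hypothetical stand-in for AvgStdevStats (not the library class)."""

    size: int = 0
    avg: float = 0.0
    stdev: float = 0.0

    @classmethod
    def for_runs(cls, runs):
        """Describe a sequence whose items are floats or SketchStats."""
        total_size = 0
        total_avg = 0.0
        moment_2 = 0.0  # Sum of squared deviations from the running average.
        for run in runs:
            if isinstance(run, cls):
                run_size, run_avg = run.size, run.avg
                # Assumption: stdev is the population standard deviation.
                run_moment_2 = run.stdev ** 2 * run.size
            else:
                run_size, run_avg, run_moment_2 = 1, float(run), 0.0
            if run_size == 0:
                continue
            new_size = total_size + run_size
            delta = run_avg - total_avg
            # Pairwise merge of two groups described only by their stats.
            moment_2 += run_moment_2 + delta * delta * total_size * run_size / new_size
            total_avg += delta * run_size / new_size
            total_size = new_size
        stdev = math.sqrt(moment_2 / total_size) if total_size else 0.0
        return cls(size=total_size, avg=total_avg, stdev=stdev)


# Appending data: pass the existing stats object as the first run.
partial = SketchStats.for_runs([1.0, 2.0])
merged = SketchStats.for_runs([partial, 3.0])
```

In this sketch, merged describes the same three samples as SketchStats.for_runs([1.0, 2.0, 3.0]), even though the individual samples are no longer available.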

2.6.2. BitCountingGroup suite

Module holding BitCountingGroup class.

class resources.libraries.python.jumpavg.BitCountingGroup.BitCountingGroup(run_list=None, stats=None, bits=None, max_value=None, prev_avg=None, comment='unknown')

Bases: object

Group of runs which tracks bit count in an efficient manner.

This class contains methods that mutate the internal state; use the copy() method to save the previous state.

The Sequence-like access is related to the list of runs, for example group[0] returns the first run in the list. Writable list-like methods are not implemented.

As the group bit count depends on previous average and overall maximal value, those values are assumed to be known beforehand (and immutable).

As the caller is allowed to divide runs into groups in any way, a method to add a single run in an efficient manner is provided.

append(run)

Mutate to add the new run, return self.

Stats are updated, but old bits value is deleted from cache.

Parameters

run – The run value to add to the group.

Returns

The updated self.

Return type

BitCountingGroup

property bits

Return overall bit content of the group.

If not cached, compute from stats and cache.

Returns

The overall information content in bits.

Return type

float

copy()

Return a new instance with copied internal state.

Returns

The copied instance.

Return type

BitCountingGroup

extend(runs)

Mutate to add the new runs, return self.

This saves a small amount of computation compared to adding runs one by one in a loop.

Stats are updated, but old bits value is deleted from cache.

Parameters

runs – The runs to add to the group.

Returns

The updated self.

Return type

BitCountingGroup
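
The interplay of mutation, cache invalidation and copy() can be sketched as follows. SketchGroup and its _placeholder_bits method are hypothetical; in particular, the placeholder cost is not the bit count formula used by the library, and the bits_computations counter exists only to make the caching visible.

```python
import math


class SketchGroup:
    """Hypothetical stand-in for BitCountingGroup (not the library class)."""

    def __init__(self, run_list=None):
        self.run_list = list(run_list) if run_list is not None else []
        self._cached_bits = None
        self.bits_computations = 0  # Demonstration only.

    def __getitem__(self, index):
        # Read-only Sequence-like access to the list of runs.
        return self.run_list[index]

    def __len__(self):
        return len(self.run_list)

    def _placeholder_bits(self):
        # Placeholder cost; NOT the formula used by jumpavg.
        if not self.run_list:
            return 0.0
        avg = sum(self.run_list) / len(self.run_list)
        return sum(math.log2(1.0 + abs(run - avg)) for run in self.run_list)

    @property
    def bits(self):
        # Compute only if not cached, then cache.
        if self._cached_bits is None:
            self.bits_computations += 1
            self._cached_bits = self._placeholder_bits()
        return self._cached_bits

    def append(self, run):
        self.run_list.append(run)
        self._cached_bits = None  # The old bits value is no longer valid.
        return self

    def extend(self, runs):
        self.run_list.extend(runs)
        self._cached_bits = None  # Invalidate once for the whole batch.
        return self

    def copy(self):
        new = SketchGroup(self.run_list)  # The run list is copied.
        new._cached_bits = self._cached_bits
        return new


group = SketchGroup([1.0, 1.0])
saved = group.copy()  # Save the previous state before mutating.
group.append(1.0).extend([3.0, 3.0])  # Methods return self, so calls chain.
```

Because bits is computed lazily, a caller who appends many runs and never reads bits pays nothing for the bit count.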

2.6.3. BitCountingGroupList suite

Module holding BitCountingGroupList class.

class resources.libraries.python.jumpavg.BitCountingGroupList.BitCountingGroupList(group_list=None, bits_except_last=0.0, max_value=None)

Bases: object

List of data groups which tracks overall bit count.

The Sequence-like access is related to the list of groups, for example group_list[0] returns the first group in the list. Writable list-like methods are not implemented.

The overall bit count is the sum of bit counts of each group. A group is a sequence of data samples accompanied by their stats. Different partitionings of data samples into groups result in different overall bit counts. This can be used to group samples in various contexts.

As the group bit count depends on previous average and overall maximal value, the order of groups is important. Having the logic encapsulated here spares the caller the effort to pass averages around.

Data can only be added, and there is some logic to skip recalculations if the bit count is not needed.

append_group_of_runs(runs)

Mutate to add a new group based on the runs, return self.

The argument is copied before adding to the group list, so further edits do not affect the group list. The argument can also be a group; in that case only the runs from it are used.

Parameters

runs (Union[Iterable[Run], BitCountingGroup]) – Runs to form the next group to be appended to self.

Returns

The updated self.

Return type

BitCountingGroupList

append_run_to_to_last_group(run)

Mutate to add a new run at the end of the last group.

Basically a one-liner that returns the group list instead of the last group.

Parameters

run (Run) – The run value to add to the last group.

Returns

The updated self.

Return type

BitCountingGroupList

Raises

IndexError – If group list is empty, no last group to add to.

property bits

Return overall bit content of the group list.

Returns

The overall information content in bits.

Return type

float

copy()

Return a new instance with copied internal state.

Returns

The copied instance.

Return type

BitCountingGroupList

extend_runs_to_last_group(runs)

Mutate to add new runs to the end of the last group.

A faster alternative to appending runs one by one in a loop.

Parameters

runs (Iterable[Run]) – The runs to add to the last group.

Returns

The updated self.

Return type

BitCountingGroupList

Raises

IndexError – If group list is empty, no last group to add to.
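
A rough sketch of how a group list can track the overall bit count while only the last group changes is shown below. SketchGroupList and toy_group_bits are hypothetical; the toy cost merely mimics the dependence on the previous group average and is not the formula used by the library. Groups are assumed to be non-empty.

```python
import math


def toy_group_bits(samples, prev_avg):
    """Toy cost of one non-empty group; NOT the formula used by jumpavg."""
    avg = sum(samples) / len(samples)
    jump = 0.0 if prev_avg is None else abs(avg - prev_avg)
    header = 1.0 + math.log2(1.0 + jump)  # Depends on the previous average.
    body = sum(math.log2(1.0 + abs(sample - avg)) for sample in samples)
    return header + body


class SketchGroupList:
    """Hypothetical stand-in for BitCountingGroupList."""

    def __init__(self):
        self.group_list = []  # Each group is a plain list of floats here.
        self.bits_except_last = 0.0

    def _prev_avg(self):
        """Average of the group before the last one, or None."""
        if len(self.group_list) < 2:
            return None
        prev = self.group_list[-2]
        return sum(prev) / len(prev)

    @property
    def bits(self):
        if not self.group_list:
            return 0.0
        last_bits = toy_group_bits(self.group_list[-1], self._prev_avg())
        return self.bits_except_last + last_bits

    def append_group_of_runs(self, runs):
        if self.group_list:
            # The current last group is final now, so freeze its cost.
            self.bits_except_last = self.bits
        self.group_list.append(list(runs))  # Copy, later edits are isolated.
        return self

    def append_run_to_to_last_group(self, run):
        # Raises IndexError if there is no last group to add to.
        self.group_list[-1].append(run)
        return self


two_groups = SketchGroupList()
two_groups.append_group_of_runs([1.0, 1.0]).append_group_of_runs([5.0])
one_group = SketchGroupList().append_group_of_runs([1.0, 1.0, 5.0])
```

With this toy cost, the same three samples cost fewer bits when split into two groups than when kept in one group, which illustrates why the partitioning matters.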

2.6.4. BitCountingStats suite

Module holding BitCountingStats class.

class resources.libraries.python.jumpavg.BitCountingStats.BitCountingStats(size=0, avg=None, stdev=0.0, max_value=None, prev_avg=None)

Bases: resources.libraries.python.jumpavg.AvgStdevStats.AvgStdevStats

Class for statistics which include information content of a group.

The information content is based on an assumption that the data consists of independent random values from a normal distribution.

Instances are only statistics, the data itself is stored elsewhere.

The coding needs to know the previous average and a maximal value, so both values are required as inputs.

This is a subclass of AvgStdevStats, even though all methods are overridden. Only the for_runs method calls the parent implementation, without using super().

classmethod for_runs(runs, max_value=None, prev_avg=None)

Return new stats instance describing the sequence of runs.

If you want to append data to an existing stats object, you can simply use the stats object as the first run.

Instead of a verb, “for” is used to start this method name, to signify the result contains less information than the input data.

The two optional values can come from outside of the runs provided.

The max_value cannot be None for non-zero size data. The implementation does not check whether any datapoint exceeds max_value.

TODO: Document the behavior for zero size result.

Parameters
  • runs (Iterable[Union[float, AvgStdevStats]]) – Sequence of data to describe by the new metadata.

  • max_value (Union[float, NoneType]) – Maximal expected value.

  • prev_avg (Union[float, NoneType]) – Population average of the previous group, if any.

Returns

The new stats instance.

Return type

cls
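
The pattern of calling the parent implementation of a classmethod without super() can be sketched as below. ParentStats and ChildStats are hypothetical names and the statistics are reduced to size and average. One plausible motivation, stated here as an assumption, is that naming the parent class explicitly makes the parent build a plain parent instance, whereas super() would make it call the child constructor, which takes different arguments.

```python
class ParentStats:
    """Hypothetical stand-in for AvgStdevStats."""

    def __init__(self, size=0, avg=0.0):
        self.size = size
        self.avg = avg

    @classmethod
    def for_runs(cls, runs):
        values = [float(run) for run in runs]
        size = len(values)
        avg = sum(values) / size if size else 0.0
        return cls(size=size, avg=avg)


class ChildStats(ParentStats):
    """Hypothetical stand-in for BitCountingStats."""

    def __init__(self, size=0, avg=None, max_value=None, prev_avg=None):
        super().__init__(size=size, avg=avg)
        self.max_value = max_value
        self.prev_avg = prev_avg

    @classmethod
    def for_runs(cls, runs, max_value=None, prev_avg=None):
        # Explicit parent call: inside it, cls is ParentStats, not ChildStats.
        base = ParentStats.for_runs(runs)
        # Build the richer instance from the plain stats and the extra inputs.
        return cls(
            size=base.size, avg=base.avg, max_value=max_value, prev_avg=prev_avg
        )


stats = ChildStats.for_runs([1.0, 3.0], max_value=10.0, prev_avg=2.5)
```

Here stats is a ChildStats instance whose size and avg come from the parent computation, while max_value and prev_avg come from outside of the runs provided.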

2.6.5. classify suite

Module holding the classify function.

Classification is one of the primary purposes of this package.

The minimal message length principle is used for grouping results into the list of groups, assuming each group is a population drawn from a different Gaussian distribution.

resources.libraries.python.jumpavg.classify.classify(values)

Return the values in groups of optimal bit count.

Here, a value is either a float, or an iterable of floats. Such iterables represent an indivisible sequence of floats.

Internally, each such sequence is replaced by AvgStdevStats after the maximal value is found.

Parameters

values (Iterable[Union[float, Iterable[float]]]) – Sequence of runs to classify.

Returns

Classified group list.

Return type

BitCountingGroupList
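
The idea of choosing the partitioning with the smallest overall bit count can be illustrated with a brute-force toy. The names toy_classify, toy_group_bits and GROUP_OVERHEAD_BITS are hypothetical, the cost function is not the one used by the library (it ignores the previous average and the maximal value), and trying every partitioning is only feasible for very short inputs; this is not the algorithm the library uses.

```python
import math
from itertools import product

GROUP_OVERHEAD_BITS = 4.0  # Hypothetical fixed cost of starting a group.


def toy_group_bits(samples):
    """Toy cost of one group; NOT the formula used by jumpavg."""
    avg = sum(samples) / len(samples)
    spread = sum(math.log2(1.0 + abs(sample - avg)) for sample in samples)
    return GROUP_OVERHEAD_BITS + spread


def flatten(value):
    """Turn a float or an indivisible sequence of floats into a list."""
    if isinstance(value, (list, tuple)):
        return [float(item) for item in value]
    return [float(value)]


def toy_classify(values):
    """Return the contiguous grouping with the minimal total toy cost."""
    items = [flatten(value) for value in values]
    if not items:
        return []
    best_groups, best_bits = None, math.inf
    # Each boundary between neighboring items is either a cut or not.
    for cuts in product((False, True), repeat=len(items) - 1):
        groups = [list(items[0])]
        for cut, item in zip(cuts, items[1:]):
            if cut:
                groups.append(list(item))
            else:
                groups[-1].extend(item)  # An item is never split.
        bits = sum(toy_group_bits(group) for group in groups)
        if bits < best_bits:
            best_groups, best_bits = groups, bits
    return best_groups


# The nested list is an indivisible sequence, so it stays in one group.
result = toy_classify([1.0, [1.0, 1.0], 10.0, 10.0, 10.0])
```

With this toy cost, the jump from values near 1 to values near 10 makes two groups cheaper than one, and the fixed per-group overhead discourages splitting any further.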