2.6. jumpavg¶
2.6.1. AvgStdevStats suite¶
Module holding AvgStdevStats class.
-
class
resources.libraries.python.jumpavg.AvgStdevStats.
AvgStdevStats
(size=0, avg=0.0, stdev=0.0)¶ Bases:
object
Class for statistics which include average and stdev of a group.
Contrary to other stats types, adding values to the group is computationally light without any caching.
Instances are only statistics, the data itself is stored elsewhere.
-
classmethod
for_runs
(runs)¶ Return new stats instance describing the sequence of runs.
If you want to append data to existing stats object, you can simply use the stats object as the first run.
Instead of a verb, “for” is used to start this method name, to signify the result contains less information than the input data.
Here, Run is a hypothetical abstract class, a union of float and cls. Defining that as a real abstract class in Python 2 is too much hassle.
- Parameters
runs (Iterable[Union[float, cls]]) – Sequence of data to describe by the new metadata.
- Returns
The new stats instance.
- Return type
cls
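The idea of folding both plain floats and existing stats objects into one result can be sketched as follows. This is a minimal illustration, not the package's implementation: `SketchStats` and `stats_for_runs` are hypothetical names, and the stdev is assumed to be the population standard deviation.

```python
import math
from dataclasses import dataclass


@dataclass
class SketchStats:
    """Hypothetical stand-in holding only size, average and population stdev."""
    size: int = 0
    avg: float = 0.0
    stdev: float = 0.0


def stats_for_runs(runs):
    """Fold floats and SketchStats items into a single SketchStats."""
    size, avg, m2 = 0, 0.0, 0.0  # m2 is the sum of squared deviations from avg
    for run in runs:
        if isinstance(run, SketchStats):
            # A stats object stands for run.size samples seen earlier.
            r_size, r_avg = run.size, run.avg
            r_m2 = run.stdev * run.stdev * run.size
        else:
            # A plain float is a group of one sample with zero spread.
            r_size, r_avg, r_m2 = 1, float(run), 0.0
        if r_size == 0:
            continue
        total = size + r_size
        delta = r_avg - avg
        # Pairwise merge of two (size, avg, m2) triples.
        m2 += r_m2 + delta * delta * size * r_size / total
        avg += delta * r_size / total
        size = total
    stdev = math.sqrt(m2 / size) if size else 0.0
    return SketchStats(size=size, avg=avg, stdev=stdev)
```

With this sketch, `stats_for_runs([stats_for_runs([1.0, 2.0, 3.0]), 4.0])` describes the same four samples as `stats_for_runs([1.0, 2.0, 3.0, 4.0])`, which mirrors the "use the stats object as the first run" advice above.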
-
2.6.2. BitCountingGroup suite¶
Module holding BitCountingGroup class.
-
class
resources.libraries.python.jumpavg.BitCountingGroup.
BitCountingGroup
(run_list=None, stats=None, bits=None, max_value=None, prev_avg=None, comment='unknown')¶ Bases:
object
Group of runs which tracks bit count in an efficient manner.
This class contains methods that mutate the internal state; use the copy() method to save the previous state.
The Sequence-like access is related to the list of runs, for example group[0] returns the first run in the list. Writable list-like methods are not implemented.
As the group bit count depends on previous average and overall maximal value, those values are assumed to be known beforehand (and immutable).
As the caller is allowed to divide runs into groups in any way, a method to add a single run in an efficient manner is provided.
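The pattern described above (mutating methods that return self, a lazily cached derived value, and copy() for preserving an earlier state) can be sketched as follows. `SketchGroup` and its `cost` property are hypothetical; the placeholder cost is an assumption made only to keep the example runnable and is not the real bit count.

```python
class SketchGroup:
    """Hypothetical group: a list of runs plus a lazily cached cost."""

    def __init__(self, run_list=None):
        self.run_list = list(run_list) if run_list is not None else []
        self._cost = None  # None means "not computed yet"

    def __getitem__(self, index):
        # Read-only, sequence-like access to the runs.
        return self.run_list[index]

    def __len__(self):
        return len(self.run_list)

    def copy(self):
        """Return a new instance with copied internal state."""
        clone = SketchGroup(self.run_list)
        clone._cost = self._cost
        return clone

    def append(self, run):
        """Mutate to add the run, drop the cached cost, return self."""
        self.run_list.append(run)
        self._cost = None
        return self

    @property
    def cost(self):
        """Compute the cost on first access after a change, then reuse it."""
        if self._cost is None:
            # Placeholder cost (an assumption): sum of absolute values.
            self._cost = float(sum(abs(run) for run in self.run_list))
        return self._cost
```

A caller that wants to try an addition without losing the earlier state would call `saved = group.copy()` before `group.append(run)`.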
-
append
(run)¶ Mutate to add the new run, return self.
Stats are updated, but the old bits value is deleted from the cache.
- Parameters
run – The run value to add to the group.
- Returns
The updated self.
- Return type
-
property
bits
¶ Return overall bit content of the group list.
If not cached, compute from stats and cache.
- Returns
The overall information content in bits.
- Return type
float
-
copy
()¶ Return a new instance with copied internal state.
- Returns
The copied instance.
- Return type
-
extend
(runs)¶ Mutate to add the new runs, return self.
This saves a small amount of computation compared to adding runs one by one in a loop.
Stats are updated, but the old bits value is deleted from the cache.
- Parameters
runs – The runs to add to the group.
- Returns
The updated self.
- Return type
-
2.6.3. BitCountingGroupList suite¶
Module holding BitCountingGroupList class.
-
class
resources.libraries.python.jumpavg.BitCountingGroupList.
BitCountingGroupList
(group_list=None, bits_except_last=0.0, max_value=None)¶ Bases:
object
List of data groups which tracks overall bit count.
The Sequence-like access is related to the list of groups, for example group_list[0] returns the first group in the list. Writable list-like methods are not implemented.
The overall bit count is the sum of the bit counts of each group. A group is a sequence of data samples accompanied by their stats. Different partitionings of data samples into groups result in different overall bit counts. This can be used to group samples in various contexts.
As the group bit count depends on the previous average and the overall maximal value, the order of groups is important. Having the logic encapsulated here spares the caller the effort to pass averages around.
Data can only be added, and there is some logic to skip recalculations if the bit count is not needed.
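One way to read the description above is that the total is kept as "cost of all groups except the last" plus "cost of the last group", so adding runs to the last group never touches the earlier groups. The sketch below illustrates that reading only. `SketchGroupList`, `toy_cost`, `average`, and the method name `append_run_to_last_group` are hypothetical, and the toy cost is an assumption standing in for the real bit count.

```python
def average(runs):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(runs) / len(runs)


def toy_cost(runs, prev_avg):
    """Assumed cost: one unit per run plus the jump from the previous average."""
    jump = 0.0 if prev_avg is None else abs(average(runs) - prev_avg)
    return len(runs) + jump


class SketchGroupList:
    """Hypothetical append-only list of groups tracking an overall cost."""

    def __init__(self, cost_of_group):
        self.cost_of_group = cost_of_group
        self.group_list = []
        self.cost_except_last = 0.0

    def _last_cost(self):
        """Cost of the last group, given the average of the group before it."""
        if not self.group_list:
            return 0.0
        prev_avg = None
        if len(self.group_list) > 1:
            prev_avg = average(self.group_list[-2])
        return self.cost_of_group(self.group_list[-1], prev_avg)

    def append_group_of_runs(self, runs):
        """Freeze the current last group's cost, then add a copied new group."""
        self.cost_except_last += self._last_cost()
        self.group_list.append(list(runs))
        return self

    def append_run_to_last_group(self, run):
        """Add one run to the last group; raises IndexError if there is none."""
        self.group_list[-1].append(run)
        return self

    @property
    def cost(self):
        return self.cost_except_last + self._last_cost()
```

Because the cost of a group depends on the average of its predecessor, the same groups appended in a different order generally give a different total in this sketch.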
-
append_group_of_runs
(runs)¶ Mutate to add a new group based on the runs, return self.
The argument is copied before adding to the group list, so further edits do not affect the group list. The argument can also be a group, in which case only the runs from it are used.
- Parameters
runs (Union[Iterable[Run], BitCountingGroup]) – Runs to form the next group to be appended to self.
- Returns
The updated self.
- Return type
-
append_run_to_to_last_group
(run)¶ Mutate to add new run at the end of the last group.
Basically a one-liner, only returning group list instead of last group.
- Parameters
run (Run) – The run value to add to the last group.
- Returns
The updated self.
- Return type
- Raises
IndexError – If group list is empty, no last group to add to.
-
property
bits
¶ Return overall bit content of the group list.
- Returns
The overall information content in bits.
- Return type
float
-
copy
()¶ Return a new instance with copied internal state.
- Returns
The copied instance.
- Return type
-
extend_runs_to_last_group
(runs)¶ Mutate to add new runs to the end of the last group.
A faster alternative to appending runs one by one in a loop.
- Parameters
runs (Iterable[Run]) – The runs to add to the last group.
- Returns
The updated self.
- Return type
- Raises
IndexError – If group list is empty, no last group to add to.
-
2.6.4. BitCountingStats suite¶
Module holding BitCountingStats class.
-
class
resources.libraries.python.jumpavg.BitCountingStats.
BitCountingStats
(size=0, avg=None, stdev=0.0, max_value=None, prev_avg=None)¶ Bases:
resources.libraries.python.jumpavg.AvgStdevStats.AvgStdevStats
Class for statistics which include information content of a group.
The information content is based on an assumption that the data consists of independent random values from a normal distribution.
Instances are only statistics, the data itself is stored elsewhere.
The coding needs to know the previous average and a maximal value, so both values are required as inputs.
This is a subclass of AvgStdevStats, even though all methods are overridden. Only the for_runs method calls the parent implementation, without using super().
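To give a feel for what "information content under a normal distribution assumption" can mean, here is a deliberately simplified estimate. It is not the package's formula: `sketch_bits` is a hypothetical function, it ignores the previous average entirely, it assumes values are coded at unit precision, and it clamps the per-sample term at zero.

```python
import math


def sketch_bits(size, stdev, max_value):
    """Rough code length in bits for a group (illustrative assumption only)."""
    if size < 1:
        return 0.0
    # Stating the average: one of max_value + 1 equally likely integer values.
    bits = math.log2(max_value + 1.0)
    if size < 2 or stdev <= 0.0:
        return bits
    # Each sample: differential entropy of a normal distribution, in bits,
    # clamped so that a very narrow distribution never yields a negative cost.
    per_sample = 0.5 * math.log2(2.0 * math.pi * math.e * stdev * stdev)
    return bits + size * max(0.0, per_sample)
```

In this simplified model a wider spread or a larger group costs more bits, while a larger max_value makes every group more expensive to start.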
-
classmethod
for_runs
(runs, max_value=None, prev_avg=None)¶ Return new stats instance describing the sequence of runs.
If you want to append data to existing stats object, you can simply use the stats object as the first run.
Instead of a verb, “for” is used to start this method name, to signify the result contains less information than the input data.
The two optional values can come from outside of the runs provided.
The max_value cannot be None for data of non-zero size. The implementation does not check whether any data point exceeds max_value.
TODO: Document the behavior for zero size result.
- Parameters
runs (Iterable[Union[float, AvgStdevStats]]) – Sequence of data to describe by the new metadata.
max_value (Union[float, NoneType]) – Maximal expected value.
prev_avg (Union[float, NoneType]) – Population average of the previous group, if any.
- Returns
The new stats instance.
- Return type
cls
-
2.6.5. classify suite¶
Module holding the classify function.
Classification is one of the primary purposes of this package.
The minimal message length principle is used for grouping results into the list of groups, assuming each group is a population drawn from a different Gaussian distribution.
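The grouping principle can be illustrated with a small dynamic program that splits a sequence into consecutive groups so that a total code length is minimal. This is a sketch under stated assumptions, not the package's algorithm: `group_cost` and `sketch_classify` are hypothetical, the cost formula is a toy, the cost ignores the previous group's average, and only plain floats are handled.

```python
import math


def group_cost(values, max_value):
    """Toy code length in bits for one group (an assumption for illustration)."""
    size = len(values)
    avg = sum(values) / size
    stdev = math.sqrt(sum((v - avg) ** 2 for v in values) / size)
    # Fixed price for opening a group, plus a per-sample price growing with spread.
    return math.log2(max_value + 1.0) + size * math.log2(1.0 + stdev)


def sketch_classify(values):
    """Split values into consecutive groups with minimal total toy cost."""
    values = [float(v) for v in values]
    if not values:
        return []
    max_value = max(values)
    count = len(values)
    best = [0.0] + [math.inf] * count  # best[i]: minimal cost of values[:i]
    start_of_last = [0] * (count + 1)  # where the last group of values[:i] starts
    for end in range(1, count + 1):
        for start in range(end):
            cost = best[start] + group_cost(values[start:end], max_value)
            if cost < best[end]:
                best[end] = cost
                start_of_last[end] = start
    # Walk back from the end to recover the group boundaries.
    groups = []
    end = count
    while end > 0:
        start = start_of_last[end]
        groups.append(values[start:end])
        end = start
    groups.reverse()
    return groups
```

Opening a group has a fixed price in this toy model, so a new group is only worthwhile when it lowers the spread of its members enough to pay for itself. This is the trade-off the minimal message length principle expresses.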
-
resources.libraries.python.jumpavg.classify.
classify
(values)¶ Return the values in groups of optimal bit count.
Here, a value is either a float, or an iterable of floats. Such iterables represent an indivisible sequence of floats.
Internally, such a sequence is replaced by AvgStdevStats after the maximal value is found.
- Parameters
values (Iterable[Union[float, Iterable[float]]]) – Sequence of runs to classify.
- Returns
Classified group list.
- Return type