2.4. PLRsearch

2.4.1. Integrator suite

Module for numerical integration, tightly coupled to the PLRsearch algorithm.

See log_plus for an explanation why None acts as a special case “float” number.

TODO: Separate optimizations specific to PLRsearch and distribute the rest as a standalone package so other projects may reuse it.

resources.libraries.python.PLRsearch.Integrator.estimate_nd(communication_pipe, scale_coeff=8.0, trace_enabled=False)

Use Bayesian inference on data from the communication pipe, put the result back to the pipe.

TODO: Use a logging framework that works in a user-friendly way. (Note that multiprocessing_logging does not work well with robot and robotbackgroundlogger only works for threads, not processes. Alternatively, wait for https://github.com/robotframework/robotframework/pull/2182. Either way, the current implementation with trace_enabled looks ugly.)

The result is the average and standard deviation of the posterior distribution of a single dependent (scalar, float) value. The prior is assumed to be uniform on (-1, 1) for every parameter. The number of parameters, and the function for computing the dependent value and likelihood, both come from the input.

The likelihood is assumed to be extremely uneven (but never zero), so the function should return the logarithm of the likelihood. The integration method is basically Monte Carlo (TODO: Add links to the notions used here.), but importance sampling is used in order to focus on the part of the parameter space with (relatively) non-negligible likelihood.

A multivariate Gaussian distribution is used for focusing, so only unimodal posterior distributions are handled correctly. Initial samples are mostly used for shaping (and shifting) the Gaussian distribution; later samples will probably dominate. Thus, initially the algorithm behaves more like “find the maximum” than “reliably integrate”. As for later iterations of PLRsearch, it is assumed that the distribution position does not change rapidly; thus the integration algorithm also returns the distribution data, to be used as the initial focus in the next iteration.

There are workarounds in place that allow an old or default focus tracker to be updated reasonably, even when the initial samples of a new iteration have way smaller (or larger) weights.

During the “find the maximum” phase, the focus tracker frequently takes a wrong shape (compared to the samples observed in equilibrium). Therefore the scale_coeff argument is left for humans to tweak, so that convergence is reliable and quick.

Until the distribution locates itself roughly around the maximum likelihood point, the integration results are probably wrong. That means some minimal time is needed for the result to become reliable.

TODO: The following is not currently implemented. The reported standard deviation attempts to signal inconsistency (when one sample has a dominating weight compared to the rest of the samples), but some human supervision is strongly encouraged.

To facilitate running in worker processes, arguments and results are communicated via a pipe. The computation does not start until arguments appear in the pipe; the computation stops when another item (a stop object) is detected in the pipe (and the result is put to the pipe).

TODO: Create classes for arguments and results, so their fields are documented (and code perhaps more readable).

Input/argument object (received from pipe) is a 4-tuple of the following fields:
  • dimension: Integer, number of parameters to consider.
  • dilled_function: Function (serialized using dill), which:
    - Takes the dimension number of float parameters from (-1, 1).
    - Returns float 2-tuple of dependent value and parameter log-likelihood.
  • param_focus_tracker: VectorStatTracker to use for initial focus.
  • max_samples: None or a limit for samples to use.

Output/result object (sent to pipe queue) is a 5-tuple of the following fields:
  • value_tracker: ScalarDualStatTracker estimate of value posterior.
  • param_focus_tracker: VectorStatTracker to use for initial focus next.
  • debug_list: List of debug strings to log at main process.
  • trace_list: List of trace strings to pass to main process if enabled.
  • samples: Number of samples used in computation (to make it reproducible).

Trace strings are very verbose, so it is not recommended to enable them. If they are not enabled, trace_list will be empty. If debugging output is needed, it is recommended to manually promote selected trace lines to debug_list instead.

Parameters
  • communication_pipe (multiprocessing.Connection) – Endpoint for communication with parent process.

  • scale_coeff (float) – Float number to tweak convergence speed with.

  • trace_enabled (bool) – Whether trace list should be populated at all.

Raises
  • OverflowError – If one sample dominates the rest too much, or if value_logweight_function does not handle some part of parameter space carefully enough.

  • numpy.linalg.LinAlgError – If the focus shape gets singular (due to rounding errors). Try changing scale_coeff.
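
A minimal sketch of the pipe protocol described above, for orientation only: it launches the worker via try_estimate_nd, sends the input 4-tuple, lets the computation run for a while, sends a stop object, and receives the result 5-tuple. The toy value_logweight function, the timing, and the exact calling convention expected of the dilled function are assumptions of this sketch; the real caller is PLRsearch.measure_and_compute.

    # Sketch only; the toy function and timings are illustrative assumptions.
    import multiprocessing
    import time

    import dill

    from resources.libraries.python.PLRsearch import Integrator, stat_trackers


    def value_logweight(first_param, second_param):
        """Toy 2-parameter function: dependent value and its log-likelihood."""
        value = 10.0 * first_param
        logweight = -50.0 * ((first_param - 0.3) ** 2 + (second_param + 0.1) ** 2)
        return value, logweight


    def main():
        parent_end, worker_end = multiprocessing.Pipe()
        worker = multiprocessing.Process(
            target=Integrator.try_estimate_nd, args=(worker_end, 8.0, False)
        )
        worker.daemon = True
        worker.start()
        # Input/argument 4-tuple: dimension, dilled function, focus tracker, max samples.
        focus_tracker = stat_trackers.VectorStatTracker(2).unit_reset()
        parent_end.send((2, dill.dumps(value_logweight), focus_tracker, 10000))
        time.sleep(1.0)  # the real caller performs the trial measurement here
        parent_end.send(None)  # any second item acts as the stop object
        # Output/result 5-tuple as documented above.
        value_tracker, focus_tracker, debug_list, trace_list, samples = parent_end.recv()
        worker.join()
        print(value_tracker.average, samples)


    if __name__ == "__main__":
        main()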

resources.libraries.python.PLRsearch.Integrator.generate_sample(averages, covariance_matrix, dimension, scale_coeff)

Generate next sample for estimate_nd.

Arguments control the multivariate normal “focus”. Keep generating until the sample point fits into the unit area (inside (-1, 1) in every coordinate).

Parameters
  • averages (Indexable of N floats) – Coordinates of the focus center.

  • covariance_matrix (Indexable of N indexables of N floats) – Matrix controlling the spread around the average.

  • dimension (int) – If N is the dimension, averages is an N-vector and the matrix is NxN.

  • scale_coeff (float) – Coefficient to conformally multiply the spread.

Returns

The generated sample point.

Return type

N-tuple of float
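
An illustrative reimplementation of the rejection loop, assuming numpy's multivariate normal sampler is an acceptable stand-in for whatever the real function uses, and that the unit area means the (-1, 1) interval in every coordinate:

    # Illustrative sketch, not the module's actual implementation.
    import numpy


    def generate_sample_sketch(averages, covariance_matrix, dimension, scale_coeff):
        """Draw from the scaled multivariate normal until the point fits
        into (-1, 1) in every coordinate."""
        # dimension is implied by the shapes of the inputs in this sketch.
        covariance = scale_coeff * numpy.array(covariance_matrix, dtype=float)
        while True:
            point = numpy.random.multivariate_normal(averages, covariance)
            if all(-1.0 < coordinate < 1.0 for coordinate in point):
                return tuple(float(coordinate) for coordinate in point)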

resources.libraries.python.PLRsearch.Integrator.try_estimate_nd(communication_pipe, scale_coeff=8.0, trace_enabled=False)

Call estimate_nd but catch any exception and send traceback.

This function does not return anything, computation result is sent via the communication pipe instead.

TODO: Move scale_coeff to a field of data class with constructor/factory hiding the default value, and receive its instance via pipe, instead of argument.

Parameters
  • communication_pipe (multiprocessing.Connection) – Endpoint for communication with parent process.

  • scale_coeff (float) – Float number to tweak convergence speed with.

  • trace_enabled (bool) – Whether to emit trace level debugs. Keeping trace disabled improves speed and saves memory. Enable trace only when debugging the computation itself.

Raises

BaseException – Anything raised by interpreter or estimate_nd.

2.4.2. PLRsearch suite

Module holding PLRsearch class.

class resources.libraries.python.PLRsearch.PLRsearch.PLRsearch(measurer, trial_duration_per_trial, packet_loss_ratio_target, trial_number_offset=0, timeout=7200.0, trace_enabled=False)

Bases: object

A class to encapsulate data relevant for the search method.

The context is performance testing of packet processing systems. The system, when being offered a steady stream of packets, can process some of them successfully; others are considered “lost”.

See docstring of the search method for algorithm description.

Two constants are stored as class fields for speed.

Methods other than search (and __init__) are just internal code structure.

TODO: Those method names should start with underscore then.

static find_critical_rate(trace, lfit_func, min_rate, max_rate, loss_ratio_target, mrr, spread)

Given ratio target and parameters, return the achieving offered load.

This is basically an inverse function to lfit_func when the parameters are fixed. Instead of an efficient implementation of the inverse function, this implementation uses brute-force binary search. It bisects the (min_rate, max_rate) interval until the critical load is found (or the interval becomes degenerate). This implementation assures the min and max rate limits are honored.

TODO: Use some method with faster convergence?

Parameters
  • trace (function (str, object) -> None) – A multiprocessing-friendly logging function (closure).

  • lfit_func (Function from 3 floats to float.) – Fitting function, typically lfit_stretch or lfit_erf.

  • min_rate (float) – Lower bound for binary search [pps].

  • max_rate (float) – Upper bound for binary search [pps].

  • loss_ratio_target (float) – Fitting function should return loss rate giving this ratio at the returned load and parameters [1].

  • mrr (float) – The mrr parameter for the fitting function [pps].

  • spread (float) – The spread parameter for the fitting function [pps].

Returns

Load [pps] which achieves the target with given parameters.

Return type

float
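
The bisection can be sketched as follows; the no-op trace closure, the fixed iteration count, and deriving the loss ratio as loss rate divided by offered load are assumptions of this sketch, not necessarily what the real method does.

    # Sketch of the bisection; see assumptions in the paragraph above.
    import math


    def find_critical_rate_sketch(
        lfit_func, min_rate, max_rate, loss_ratio_target, mrr, spread, bisections=64
    ):
        """Bisect (min_rate, max_rate) until the predicted loss ratio at the
        midpoint matches the target (or the interval degenerates)."""
        def trace(name, value):  # no-op stand-in for the logging closure
            pass

        low, high = min_rate, max_rate
        for _ in range(bisections):
            rate = (low + high) / 2.0
            if rate in (low, high):
                break  # interval degenerated at float precision
            # The fitting function returns log of average loss rate [pps];
            # dividing by the offered load gives the predicted loss ratio.
            loss_ratio = math.exp(lfit_func(trace, rate, mrr, spread)) / rate
            if loss_ratio > loss_ratio_target:
                high = rate  # too lossy, the critical load must be lower
            else:
                low = rate  # acceptable loss, the critical load is at or above
        return (low + high) / 2.0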

static lfit_erf(trace, load, mrr, spread)

Erf-based fitting function.

Return the logarithm of the average packet loss per second when the load (argument) is offered to a system with given mrr and spread (parameters). The erf function is the primitive function (antiderivative) of the normal distribution density. The average itself is the definite integral, from zero to load, of a shifted and x-scaled erf function. As the integrator is sensitive to discontinuities, and it calls this function over large areas of parameter space, the implementation has to avoid rounding errors and overflows, and correctly approximate underflows.

TODO: Explain how the high-level description has been converted into an implementation full of ifs.

Parameters
  • trace (function (str, object) -> NoneType) – A multiprocessing-friendly logging function (closure).

  • load (float) – Offered load (positive), in packets per second.

  • mrr (float) – Parameter of this fitting function, equal to limiting (positive) average number of packets received (as opposed to lost) when offered load is many spreads more than mrr.

  • spread (float) – The x-scaling parameter (positive). No nice semantics, roughly corresponds to size of “tail” for loads below mrr.

Returns

Logarithm of average number of packets lost per second.

Return type

float

static lfit_stretch(trace, load, mrr, spread)

Stretch-based fitting function.

Return the logarithm of the average packet loss per second when the load (argument) is offered to a system with given mrr and spread (parameters). The stretch function is 1/(1+exp(-x)). The average itself is the definite integral, from zero to load, of a shifted and x-scaled stretch function. As the integrator is sensitive to discontinuities, and it calls this function over large areas of parameter space, the implementation has to avoid rounding errors and overflows, and correctly approximate underflows.

TODO: Explain how the high-level description has been converted into an implementation full of ifs.

Parameters
  • trace (function (str, object) -> NoneType) – A multiprocessing-friendly logging function (closure).

  • load (float) – Offered load (positive), in packets per second.

  • mrr (float) – Parameter of this fitting function, equal to limiting (positive) average number of packets received (as opposed to lost) when offered load is many spreads more than mrr.

  • spread (float) – The x-scaling parameter (positive). No nice semantics, roughly corresponds to size of “tail” for loads below mrr.

Returns

Logarithm of average number of packets lost per second.

Return type

float
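
For orientation, the math described above for the stretch case, written naively (no trace argument, and no protection against the rounding errors, overflows, and underflows the real implementation is mostly about):

    # Naive math sketch only; the real lfit_stretch guards against
    # overflow/underflow and takes a trace argument.
    import math


    def lfit_stretch_naive(load, mrr, spread):
        """Log of average loss rate: integral from 0 to load of the shifted,
        x-scaled stretch (logistic) function 1/(1+exp(-x))."""
        # The antiderivative of 1/(1+exp(-u)) is log(1+exp(u)).
        upper = spread * math.log1p(math.exp((load - mrr) / spread))
        lower = spread * math.log1p(math.exp(-mrr / spread))
        return math.log(upper - lower)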

static log_weight(trace, lfit_func, trial_result_list, mrr, spread)

Return log of weight of trial results by the function and parameters.

The Integrator assumes a uniform distribution, but over different (transformed) parameters. Weight and likelihood are used interchangeably here anyway.

Each trial has an intended load, a sent count and a loss count (probably counting unsent packets as loss, as they signal the load is too high for the traffic generator). The fitting function is used to compute the average loss rate. A geometric distribution (with the average loss per trial) is used to get the likelihood of one trial result; the overall likelihood is the product of all trial likelihoods. As likelihoods can be extremely small, logarithms are tracked instead.

The current implementation does not use direct loss rate from the fitting function, as the input and output units may not match (e.g. intended load in TCP transactions, loss in packets). Instead, the expected average loss is scaled according to the number of packets actually sent.

TODO: Copy ReceiveRateMeasurement from MLRsearch.

Parameters
  • trace (function (str, object) -> None) – A multiprocessing-friendly logging function (closure).

  • lfit_func (Function from 3 floats to float.) – Fitting function, typically lfit_stretch or lfit_erf.

  • trial_result_list (list of MLRsearch.ReceiveRateMeasurement) – List of trial measurement results.

  • mrr (float) – The mrr parameter for the fitting function.

  • spread (float) – The spread parameter for the fitting function.

Returns

Logarithm of result weight for given function and parameters.

Return type

float
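
A simplified sketch of the weighting described above: per-trial geometric log-likelihood, with the expected loss scaled by the actual send count, summed in log space. The trial attribute names (intended_load, transmit_count, loss_count) and the exact scaling are assumptions of this sketch; consult MLRsearch.ReceiveRateMeasurement and the implementation for the real details.

    # Sketch; attribute names and scaling are assumptions (see above).
    import math


    def log_weight_sketch(lfit_func, trial_result_list, mrr, spread):
        """Sum of per-trial log-likelihoods under a geometric loss model."""
        def trace(name, value):  # no-op stand-in for the logging closure
            pass

        log_weight = 0.0
        for trial in trial_result_list:
            # Expected loss ratio from the fitting function at the trial's load.
            log_loss_rate = lfit_func(trace, trial.intended_load, mrr, spread)
            loss_ratio = math.exp(log_loss_rate) / trial.intended_load
            # Scale to the number of packets actually sent.
            expected_loss = loss_ratio * trial.transmit_count
            # Geometric distribution with mean m: log P(k) = k*log(m) - (k+1)*log(1+m).
            # Assumes expected_loss > 0; the real code uses log_plus/None handling
            # for such corner cases.
            log_weight += (
                trial.loss_count * math.log(expected_loss)
                - (trial.loss_count + 1) * math.log1p(expected_loss)
            )
        return log_weight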

log_xerfcx_10 = -1.45373852745621

measure_and_compute(trial_duration, transmit_rate, trial_result_list, min_rate, max_rate, focus_trackers=(None, None), max_samples=None)

Perform both measurement and computation at once.

High level steps: Prepare and launch computation worker processes, perform the measurement, stop computation and combine results.

The Integrator needs a specific function to process (-1, 1) parameters. As our fitting functions use dimensional parameters, a transformation is performed, resulting in a specific prior distribution over the dimensional parameters. The maximal rate (line rate) is needed for that transformation.

Two fitting functions are used, and computation is started on a temporary worker process per fitting function. After the measurement, the average and stdev of the critical rate (not log) from each worker are combined and returned. Raw averages are also returned; the offered load for the next iteration is chosen based on them. The idea is that one fitting function might fit much better, and measurements at its average are best for relevant results (for both), but we do not know which fitting function it is.

Focus trackers are updated in-place. If a focus tracker is None, a new instance is created.

TODO: Define a class for the result object, so that fields are documented. TODO: Re-use processes, instead of creating one on each computation? TODO: As only one result is needed fresh, figure out a way to keep the other worker running. This will allow a shorter duration per trial. Special handling at the first and last measurement will be needed (to properly initialize and to properly combine results).

Parameters
  • trial_duration (float) – Length of the measurement in seconds.

  • transmit_rate (float) – Offered load in packets per second.

  • trial_result_list (list of MLRsearch.ReceiveRateMeasurement) – Results of previous measurements.

  • min_rate (float) – Practical minimum of possible offered load.

  • max_rate (float) – Practical maximum of possible offered load.

  • focus_trackers (2-tuple of None or stat_trackers.VectorStatTracker) – Pair of trackers initialized to speed up the numeric computation.

  • max_samples (None or int) – Limit for integrator samples, for debugging.

Returns

Measurement and computation results.

Return type

_ComputeResult

search(min_rate, max_rate)

Perform the search, return average and stdev for throughput estimate.

Considering measurer and packet_loss_ratio_target (see __init__), find such an offered load (called the critical load) that is expected to hit the target loss ratio in the limit of very long trial duration. As the system is probabilistic (and test duration is finite), the critical load is only estimated. Return the average and standard deviation of the estimate.

In principle, this algorithm performs trial measurements, each with a varied offered load (which is constant during the trial). During each measurement, Bayesian inference is performed on all the measurement results so far. When the timeout is up, the last estimate is returned; otherwise another trial is performed.

It is assumed that the system under test, even though not deterministic, still follows the law of large numbers. In other words, any growing set of measurements at a particular offered load will converge towards a unique (for the given load) packet loss ratio. This means there is a deterministic (but unknown) function mapping the offered load to the average loss ratio. This function is called the loss ratio function. This also assumes the average loss ratio does not depend on trial duration.

The actual probability distribution of loss counts achieving the average ratio on trials of various durations can be complicated (and can depend on offered load), but simply assuming a Poisson distribution will make the algorithm converge. A binomial distribution would be more precise, but Poisson is more practical, as it effectively gives less information content to high-ratio results.

Even when applying other assumptions about the loss ratio function (that it is an increasing function, that the ratio tends to zero as the load goes to zero, and that there is a global upper limit on the rate of packets processed), there are still too many different shapes of possible loss functions, which makes full Bayesian reasoning intractable.

This implementation radically simplifies things by examining only two shapes, each with finitely many (in this case just two) parameters. In other words, two fitting functions (each with two parameters and one argument). When restricting model space to one of the two fitting functions, the Bayesian inference becomes tractable (even though it needs numerical integration from Integrator class).

The first measurement is done at the middle between min_rate and max_rate, to help with convergence if max_rate measurements give loss below target. TODO: Fix overflow error and use min_rate instead of the middle.

The second measurement is done at max_rate; the next few measurements use an offered load equal to the previous load minus the excess loss rate. This simple rule is found to be good when the offered loads so far are way above the critical rate. After a few measurements, inference from the fitting functions converges faster than this initial “optimistic” procedure.

Offered loads close to the (limiting) critical rate are the most useful, as a linear approximation of the fitting function becomes good enough there (thus reducing the impact of the overall shape of the fitting function). After several trials, usually one of the fitting functions has better predictions than the other one, but the algorithm does not track that. It simply uses the estimate average, alternating between the functions. Multiple workarounds are applied to try and avoid measurements both in the zero-loss region and in the big-loss region, as their results tend to make the critical load estimate worse.

The returned average and stdev are a combination of the two fitting estimates.
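
One illustrative way to combine two (average, stdev) estimates is to treat them as an equal-weight Gaussian mixture; whether PLRsearch uses exactly this rule should be checked in measure_and_compute, so treat the following only as a sketch of the idea:

    # Illustrative combination rule only (equal-weight mixture); not
    # necessarily the exact rule used by measure_and_compute.
    import math


    def combine_estimates(avg1, stdev1, avg2, stdev2):
        """Mean and stdev of an equal-weight mixture of two estimates."""
        avg = (avg1 + avg2) / 2.0
        variance = (stdev1 ** 2 + stdev2 ** 2) / 2.0 + ((avg1 - avg2) ** 2) / 4.0
        return avg, math.sqrt(variance)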

Parameters
  • min_rate (float) – Avoid measuring at offered loads below this, in packets per second.

  • max_rate (float) – Avoid measuring at offered loads above this, in packets per second.

Returns

Average and stdev of critical load estimate.

Return type

2-tuple of float

xerfcx_limit = 0.7978845608028654

2.4.3. log_plus suite

Module holding functions for avoiding rounding underflows.

Some applications wish to manipulate non-negative numbers which are many orders of magnitude apart. In those circumstances, it is useful to store the logarithm of the intended number.

As math.log(0.0) raises an exception (instead of returning -inf or nan), and 0.0 might be a result of computation caused only by a rounding error, functions of this module use None as -inf.

TODO: Figure out a more performant way of handling -inf.

The functions handle the common task of adding or subtracting two numbers where both the operands and the result are given in logarithm form. There are conditionals to make sure overflow does not happen (if possible) during the computation.
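
As a quick illustration of the convention, a minimal sketch of the addition case; the shipped log_plus also handles argument ordering and other details, so this is only to show how None stands in for math.log(0.0):

    # Minimal sketch of the idea; use the module's log_plus in real code.
    import math


    def log_plus_sketch(first, second):
        """Return log(exp(first) + exp(second)), with None meaning log(0.0)."""
        if first is None:
            return second
        if second is None:
            return first
        # Factor out the larger exponent so exp() cannot overflow.
        if second > first:
            first, second = second, first
        return first + math.log1p(math.exp(second - first))


    # Example: adding a zero (None) leaves the other operand unchanged.
    assert log_plus_sketch(None, 0.0) == 0.0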

resources.libraries.python.PLRsearch.log_plus.log_minus(first, second)

Return logarithm of the difference of two exponents.

Basically math.log(math.exp(first) - math.exp(second)) which avoids overflow and uses None as math.log(0.0).

TODO: Support zero difference? TODO: replace with scipy.special.logsumexp ? Test it.

Parameters
  • first (float) – Logarithm of the number to subtract from (or None if zero).

  • second (float) – Logarithm of the number to subtract (or None if zero).

Returns

Logarithm of the difference.

Return type

float

Raises

RuntimeError – If the difference would be non-positive.

resources.libraries.python.PLRsearch.log_plus.log_plus(first, second)

Return logarithm of the sum of two exponents.

Basically math.log(math.exp(first) + math.exp(second)) which avoids overflow and uses None as math.log(0.0).

TODO: replace with scipy.special.logsumexp ? Test it.

Parameters
  • first (float) – Logarithm of the first number to add (or None if zero).

  • second (float) – Logarithm of the second number to add (or None if zero).

Returns

Logarithm of the sum (or None if zero).

Return type

float

resources.libraries.python.PLRsearch.log_plus.safe_exp(log_value)

Return exponential of the argument, or zero if the argument is None.

Parameters

log_value (NoneType or float) – The value to exponentiate.

Returns

The exponentiated value.

Return type

float

2.4.4. stat_trackers suite

Module for tracking average and variance for data of weighted samples.

See log_plus for an explanation why None acts as a special case “float” number.

TODO: Current implementation sets zero averages on empty tracker. Should we set None instead?

TODO: Implement __str__ and __repr__ for easier debugging.

class resources.libraries.python.PLRsearch.stat_trackers.ScalarDualStatTracker(log_sum_weight=None, average=0.0, log_variance=None, log_sum_secondary_weight=None, secondary_average=0.0, log_secondary_variance=None, max_log_weight=None)

Bases: resources.libraries.python.PLRsearch.stat_trackers.ScalarStatTracker

Class for tracking one-dimensional samples, offering dual stats.

It means that instead of just primary stats (identical to what ScalarStatTracker offers, which is why this is its subclass), this tracker also offers secondary stats. Secondary stats are scalar stats that track the same data, except that the most weighty (or later, if tied) sample is not considered.

Users can analyze how the secondary stats differ from the primary ones, and based on that decide whether they have processed “enough” data. One typical use is for a Monte Carlo integrator to decide whether the partial sums so far are reliable enough.

add(scalar_value, log_weight=0.0)

Return self with both stats updated after the addition of another sample.

Parameters
  • scalar_value (float) – The scalar value of the sample.

  • log_weight (float) – Natural logarithm of weight of the sample. Default: 0.0 (as log of 1.0).

Returns

Updated self.

Return type

ScalarDualStatTracker

get_pessimistic_variance()

Return estimate of variance reflecting weight effects.

The typical scenario is the primary tracker dominated by a single sample. In the worst case, the secondary tracker is also dominated by a single (but different) sample.

The current implementation simply returns the variance of the average of the two trackers, as if they were independent.

Returns

Pessimistic estimate of variance (not stdev, no log).

Return type

float

class resources.libraries.python.PLRsearch.stat_trackers.ScalarStatTracker(log_sum_weight=None, average=0.0, log_variance=None)

Bases: object

Class for tracking one-dimensional samples.

Variance of one-dimensional data cannot be negative, so this class avoids possible underflows by tracking logarithm of the variance instead.

Similarly, sum of weights is also tracked as a logarithm.
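
The update behind add() is the standard weighted incremental mean/variance recurrence; a naive, non-log sketch is shown below to make the idea concrete. The real tracker keeps the weight sum and the variance as logarithms (with None for log of zero) precisely to avoid the underflow this plain version would hit.

    # Naive (non-log) sketch of the weighted incremental update.
    def weighted_update(sum_weight, average, variance, value, weight):
        """One step of the weighted incremental mean/variance recurrence."""
        new_sum_weight = sum_weight + weight
        shift = value - average
        new_average = average + shift * (weight / new_sum_weight)
        # Old spread scaled down, plus the new sample's contribution.
        new_variance = (
            variance * (sum_weight / new_sum_weight)
            + shift * (value - new_average) * (weight / new_sum_weight)
        )
        return new_sum_weight, new_average, new_variance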

add(scalar_value, log_weight=0.0)

Return updated stats corresponding to addition of another sample.

Parameters
  • scalar_value (float) – The scalar value of the sample.

  • log_weight (float) – Natural logarithm of weight of the sample. Default: 0.0 (as log of 1.0).

Returns

Updated self.

Return type

ScalarStatTracker

copy()

Return new ScalarStatTracker instance with the same state as self.

The point of this method is the return type: instances of subclasses can get rid of their additional data this way.

Returns

New instance with the same core state.

Return type

ScalarStatTracker

class resources.libraries.python.PLRsearch.stat_trackers.VectorStatTracker(dimension=2, log_sum_weight=None, averages=None, covariance_matrix=None)

Bases: object

Class for tracking multi-dimensional samples.

Contrary to one-dimensional data, multi-dimensional covariance matrix contains off-diagonal elements which can be negative.

But, sum of weights is never negative, so it is tracked as a logarithm.

The code assumes every method gets arguments with correct dimensionality as set in constructor. No checks are performed (for performance reasons).

TODO: Should we provide a subclass with the dimensionality checks?

add_get_shift(vector_value, log_weight=0.0)

Update the state by adding another sample and return the shift.

Shift is the vector from old average to new sample. For most callers, returning shift is more useful than returning self.

Parameters
  • vector_value (iterable of float) – The value of the sample.

  • log_weight (float) – Natural logarithm of weight of the sample. Default: 0.0 (as log of 1.0).

Returns

Shift vector

Return type

list of float

add_without_dominance_get_distance(vector_value, log_weight=0.0)

Update stats, avoid having the sample dominate, return old distance.

If the weight of the incoming sample is far bigger than the weight of all the previous data together, covariance matrix would suffer from underflow. To avoid that, this method manipulates both weights before calling add().

The old covariance matrix (before update) can define a metric. Shift is a vector, so the metric can be used to compute a distance between old average and new sample.
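
The distance in that metric is essentially a Mahalanobis distance; a sketch using numpy (the real tracker may compute it differently):

    # Illustrative sketch of the distance in the old-covariance metric.
    import numpy


    def distance_in_old_metric(old_averages, old_covariance_matrix, vector_value):
        """Mahalanobis-style distance from the old average to the new sample."""
        shift = numpy.array(vector_value, dtype=float) - numpy.array(old_averages, dtype=float)
        inverse = numpy.linalg.inv(numpy.array(old_covariance_matrix, dtype=float))
        return float(numpy.sqrt(shift.dot(inverse).dot(shift)))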

Parameters
  • vector_value (iterable of float) – The value of the sample.

  • log_weight (float) – Natural logarithm of weight of the sample. Default: 0.0 (as log of 1.0).

Returns

Updated self.

Return type

VectorStatTracker

copy()

Return new instance with the same state as self.

The main usage is to simplify debugging. This method allows storing the old state, while self continues to track towards the new state.

Returns

Created tracker instance.

Return type

VectorStatTracker

reset()

Return state set to empty data of proper dimensionality.

Returns

Updated self.

Return type

VectorStatTracker

unit_reset()

Reset state but use the unit matrix as covariance.

Returns

Updated self.

Return type

VectorStatTracker