tf.contrib.kfac.fisher_factors.NaiveDiagonalFactor

Class NaiveDiagonalFactor

Inherits From: DiagonalFactor

Defined in tensorflow/contrib/kfac/python/ops/fisher_factors.py.

FisherFactor for a diagonal approximation of any type of param's Fisher.

Note that this uses the naive "square the sum" estimator, so it is applicable to any type of parameter in principle but has very high variance.
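The variance remark can be illustrated with a minimal NumPy sketch. It assumes that "square the sum" means squaring the batch-summed gradient and dividing by the batch size, and it compares that against averaging per-example squared gradients. The function names, the synthetic zero-mean gradients, and the variances in `true_diag` are all hypothetical and are not part of the library.

```python
import numpy as np


def square_of_sum_estimate(per_example_grads):
    """Naive estimate: square the batch-summed gradient, divide by batch size."""
    batch_size = per_example_grads.shape[0]
    summed = per_example_grads.sum(axis=0)
    return summed ** 2 / batch_size


def mean_of_squares_estimate(per_example_grads):
    """Reference estimate: average the per-example squared gradients."""
    return (per_example_grads ** 2).mean(axis=0)


rng = np.random.default_rng(0)
true_diag = np.array([1.0, 4.0])  # hypothetical per-parameter variances
batch_size, trials = 32, 2000

naive, reference = [], []
for _ in range(trials):
    # Synthetic zero-mean "gradients" with the variances above (an assumption).
    grads = rng.normal(0.0, np.sqrt(true_diag), size=(batch_size, 2))
    naive.append(square_of_sum_estimate(grads))
    reference.append(mean_of_squares_estimate(grads))

naive = np.array(naive)
reference = np.array(reference)
print("naive     mean:", naive.mean(axis=0), "variance:", naive.var(axis=0))
print("reference mean:", reference.mean(axis=0), "variance:", reference.var(axis=0))
```

Under these assumptions both estimates centre on `true_diag`, but the naive one does not tighten as the batch grows, which is consistent with the "very high variance" caveat.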

Properties

name

Methods

__init__

__init__(
    params_grads,
    batch_size
)

Initializes NaiveDiagonalFactor instance.

Args:

  • params_grads: Sequence of Tensors, each with same shape as parameters this FisherFactor corresponds to. For example, the gradient of the loss with respect to parameters.
  • batch_size: int or 0-D Tensor. Size of the mini-batch.

get_cov

get_cov()

Get full covariance matrix.

Returns:

Tensor of shape [n, n]. Represents all parameter-parameter correlations captured by this FisherFactor.

get_cov_var

get_cov_var()

Get variable backing this FisherFactor.

May or may not be the same as self.get_cov().

Returns:

Variable of shape self._cov_shape.

instantiate_cov_variables

instantiate_cov_variables()

Makes the internal cov variable(s).

instantiate_inv_variables

instantiate_inv_variables()

Makes the internal "inverse" variable(s).

left_multiply_matpower

left_multiply_matpower(
    x,
    exp,
    damping_func
)

Left multiplies 'x' by matrix power of this factor (w/ damping applied).

This calculation is essentially: (C + damping * I)**exp * x, where * is matrix multiplication, ** is matrix power, I is the identity matrix, and C is the matrix represented by this factor.

x can represent either a matrix or a vector. For some factors, 'x' might represent a vector but actually be stored as a 2D matrix for convenience.

Args:

  • x: Tensor. Represents a single vector. Shape depends on implementation.
  • exp: float. The matrix exponent to use.
  • damping_func: A function that computes a 0-D Tensor or a float which will be the damping value used, i.e. damping = damping_func().

Returns:

Tensor of same shape as 'x' representing the result of the multiplication.
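For a diagonal factor, the damped matrix power reduces to an elementwise operation on the diagonal. The following NumPy sketch illustrates that idea and checks it against the dense formula. The helper `left_multiply_matpower_diag` and its `cov_diag` argument are hypothetical names used for illustration only; this is not the library's implementation.

```python
import numpy as np


def left_multiply_matpower_diag(cov_diag, x, exp, damping_func):
    """Compute (diag(cov_diag) + damping * I)**exp * x without forming the matrix."""
    damping = damping_func()
    scale = (cov_diag + damping) ** exp
    # Reshape so the scale applies along the first axis of a vector or a matrix.
    scale = scale.reshape((-1,) + (1,) * (x.ndim - 1))
    return scale * x


cov_diag = np.array([1.0, 3.0, 0.0])   # diagonal of C (illustrative values)
x = np.array([2.0, 4.0, 1.0])
damping_func = lambda: 1.0             # damping = damping_func()

result = left_multiply_matpower_diag(cov_diag, x, -1.0, damping_func)

# Dense check of the same formula: (C + damping * I)**-1 * x.
dense = np.diag(cov_diag) + damping_func() * np.eye(3)
expected = np.linalg.inv(dense) @ x

print(result)  # → [1. 1. 1.]
```

Note how damping keeps the third entry finite even though the corresponding diagonal element of C is zero.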

make_covariance_update_op

make_covariance_update_op(ema_decay)

Constructs and returns the covariance update Op.

Args:

  • ema_decay: The exponential moving average decay (float or Tensor).

Returns:

An Op for updating the covariance Variable referenced by _cov.
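The role of ema_decay can be sketched in NumPy, assuming the usual exponential-moving-average convention in which the stored value is scaled by ema_decay and the new estimate by (1 - ema_decay). The function `ema_covariance_update` is a hypothetical stand-in for the returned Op, and any zero-debiasing the library may apply is not modelled here.

```python
import numpy as np


def ema_covariance_update(cov, new_cov, ema_decay):
    """One EMA step: keep ema_decay of the old value, blend in the new estimate."""
    return ema_decay * cov + (1.0 - ema_decay) * new_cov


cov = np.zeros(2)  # stands in for the covariance Variable (illustrative)
for new_cov in (np.array([1.0, 2.0]), np.array([3.0, 2.0])):
    cov = ema_covariance_update(cov, new_cov, ema_decay=0.9)

print(cov)
```

A decay close to 1 makes the running estimate change slowly, which smooths out noisy per-batch estimates at the cost of slower adaptation.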

make_inverse_update_ops

make_inverse_update_ops()

Create and return update ops corresponding to registered computations.

register_matpower

register_matpower(
    exp,
    damping_func
)

right_multiply_matpower

right_multiply_matpower(
    x,
    exp,
    damping_func
)

Right multiplies 'x' by matrix power of this factor (w/ damping applied).

This calculation is essentially: x * (C + damping * I)**exp, where * is matrix multiplication, ** is matrix power, I is the identity matrix, and C is the matrix represented by this factor.

Unlike left_multiply_matpower, x will always be a matrix.

Args:

  • x: Tensor. Represents a matrix. Shape depends on implementation.
  • exp: float. The matrix exponent to use.
  • damping_func: A function that computes a 0-D Tensor or a float which will be the damping value used, i.e. damping = damping_func().

Returns:

Tensor of same shape as 'x' representing the result of the multiplication.
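The right-multiplication formula can be sketched for a generic dense, symmetric C using an eigendecomposition to form the damped matrix power. This is an illustration of the formula only, under the assumption that C + damping * I is symmetric positive definite. The helper `right_multiply_matpower_dense` is hypothetical and does not reflect how (or whether) this particular factor implements the method.

```python
import numpy as np


def right_multiply_matpower_dense(cov, x, exp, damping_func):
    """Compute x * (cov + damping * I)**exp for a symmetric matrix cov."""
    damping = damping_func()
    damped = cov + damping * np.eye(cov.shape[0])
    # Symmetric eigendecomposition: damped = V diag(w) V^T, so the matrix
    # power is V diag(w**exp) V^T (valid for fractional exp when w > 0).
    eigvals, eigvecs = np.linalg.eigh(damped)
    matpower = (eigvecs * eigvals ** exp) @ eigvecs.T
    return x @ matpower


cov = np.array([[2.0, 1.0],
                [1.0, 2.0]])            # illustrative symmetric C
x = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])              # x is a matrix, as stated above
damping_func = lambda: 0.1

out = right_multiply_matpower_dense(cov, x, -0.5, damping_func)
print(out.shape)  # → (3, 2)
```

Applying the exponent -0.5 twice should agree with applying -1 once, which is a convenient sanity check for a sketch like this.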