uncoverml.transforms package

Submodules

uncoverml.transforms.impute module

class uncoverml.transforms.impute.GaussImputer

Bases: object

Gaussian Imputer.

This imputer fits a Gaussian to the data, then conditions on this Gaussian to impute the missing data. This is effectively the same as using a linear regressor to predict the missing dimensions from all of the non-missing dimensions.

Have a look at:

https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Conditional_distributions

We use the precision (inverse covariance) form of the Gaussian for computational efficiency.
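The conditioning step can be sketched as follows. This is a minimal NumPy illustration of the standard conditional-Gaussian formula in precision form, not the GaussImputer source: the function names `fit_gaussian` and `gauss_impute` are hypothetical, and encoding missing values as NaN is an assumption made for the example.

```python
import numpy as np


def fit_gaussian(x):
    """Estimate mean and precision from the rows of x that have no NaNs."""
    complete = x[~np.isnan(x).any(axis=1)]
    mean = complete.mean(axis=0)
    precision = np.linalg.inv(np.cov(complete, rowvar=False))
    return mean, precision


def gauss_impute(x, mean, precision):
    """Fill NaNs in each row with the conditional mean given the observed dims."""
    x = np.array(x, dtype=float)  # work on a copy
    for row in x:  # each row is a view, so assignments update x
        miss = np.isnan(row)
        if not miss.any():
            continue
        if miss.all():
            row[:] = mean  # nothing to condition on
            continue
        obs = ~miss
        lam_mm = precision[np.ix_(miss, miss)]
        lam_mo = precision[np.ix_(miss, obs)]
        # E[x_m | x_o] = mu_m - inv(Lam_mm) @ Lam_mo @ (x_o - mu_o)
        shift = np.linalg.solve(lam_mm, lam_mo @ (row[obs] - mean[obs]))
        row[miss] = mean[miss] - shift
    return x


# Two correlated dimensions with zero mean and correlation 0.8.
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
filled = gauss_impute(np.array([[1.0, np.nan]]), np.zeros(2), np.linalg.inv(cov))
```

In precision form only the block of the precision matrix belonging to the missing dimensions has to be solved against, which is usually a small system when few values are missing in a row.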

class uncoverml.transforms.impute.MeanImputer

Bases: object

Simple mean imputation.

Replaces the missing values in x with the mean of x.
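As a rough illustration, mean imputation can be written in a few lines of NumPy. The sketch below assumes missing values are encoded as NaN and that the mean is taken per column; both are assumptions for the example, and `mean_impute` is a hypothetical name rather than part of this package.

```python
import numpy as np


def mean_impute(x):
    """Replace NaNs in each column of x with that column's mean."""
    x = np.array(x, dtype=float)  # work on a copy
    col_mean = np.nanmean(x, axis=0)  # mean over the non-missing entries
    rows, cols = np.where(np.isnan(x))
    x[rows, cols] = col_mean[cols]
    return x


filled = mean_impute([[1.0, np.nan], [3.0, 4.0]])
```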

class uncoverml.transforms.impute.NearestNeighboursImputer(nodes=500, k=3)

Bases: object

Nearest neighbour imputation.

This builds a KD tree from random points (without missing data), then fills in the missing data in query points with the average of their nearest neighbours.

Parameters
  • nodes (int, optional) – maximum number of points to use as candidate nearest neighbours.

  • k (int, optional) – number of neighbours to average for missing values.
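The idea can be sketched as below. To keep the example dependency-free, a brute-force distance search stands in for the KD tree; a tree structure would replace only the search step. The function name `nn_impute`, the `seed` argument, the NaN encoding of missing values, and measuring distance over the observed dimensions only are all assumptions for this sketch, not a description of NearestNeighboursImputer's internals.

```python
import numpy as np


def nn_impute(x, nodes=500, k=3, seed=0):
    """Fill NaNs with the average of the k nearest complete rows."""
    x = np.array(x, dtype=float)  # work on a copy
    missing = np.isnan(x)
    complete = x[~missing.any(axis=1)]
    if len(complete) == 0:
        raise ValueError("need at least one row without missing data")

    # Keep at most `nodes` randomly chosen complete rows as candidates.
    rng = np.random.default_rng(seed)
    n_keep = min(nodes, len(complete))
    ref = complete[rng.choice(len(complete), size=n_keep, replace=False)]
    k = min(k, n_keep)

    for i in np.where(missing.any(axis=1))[0]:
        obs = ~missing[i]
        # Squared distance to every candidate, using observed dims only.
        dist = np.sum((ref[:, obs] - x[i, obs]) ** 2, axis=1)
        nearest = np.argsort(dist)[:k]
        x[i, missing[i]] = ref[nearest][:, missing[i]].mean(axis=0)
    return x


data = [[0.0, 0.0], [0.1, 0.2], [10.0, 10.0], [10.1, 10.2], [0.05, np.nan]]
filled = nn_impute(data, nodes=500, k=2)
```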

uncoverml.transforms.impute.impute_with_mean(x, mean)

uncoverml.transforms.linear module

class uncoverml.transforms.linear.CentreTransform

Bases: object

class uncoverml.transforms.linear.LogTransform(stabilizer=1e-06)

Bases: uncoverml.transforms.linear.PositiveTransform

class uncoverml.transforms.linear.PositiveTransform(stabilizer=1e-06)

Bases: object

class uncoverml.transforms.linear.SqrtTransform(stabilizer=1e-06)

Bases: uncoverml.transforms.linear.PositiveTransform

class uncoverml.transforms.linear.StandardiseTransform

Bases: object

class uncoverml.transforms.linear.WhitenTransform(keep_fraction)

Bases: object

uncoverml.transforms.onehot module

class uncoverml.transforms.onehot.OneHotTransform

Bases: object

class uncoverml.transforms.onehot.RandomHotTransform(n_features, seed)

Bases: object

uncoverml.transforms.onehot.compute_unique_values(x)

Compute per-dimension unique values over a data vector.

This function computes the set of unique values for each dimension in x, unless the number of unique values exceeds max_onehot_dims.

Parameters

x (ndarray (n x m)) – The array over which to compute unique values. The set is taken over the first dimension.

Returns

x_sets – A list of m sets of unique values, one for each dimension in x

Return type

list of ndarray or None
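The following sketch shows one way per-dimension unique values could be computed and then used for one-hot encoding. It is illustrative only: `unique_per_column`, `one_hot_encode`, and the `max_dims` argument are hypothetical names, and returning None when the limit is exceeded is an assumption based on the return type listed above.

```python
import numpy as np


def unique_per_column(x, max_dims=None):
    """Return a list with the sorted unique values of each column of x."""
    x = np.asarray(x)
    sets = [np.unique(x[:, j]) for j in range(x.shape[1])]
    # Assumed behaviour: give up if any column has too many categories.
    if max_dims is not None and any(len(s) > max_dims for s in sets):
        return None
    return sets


def one_hot_encode(x, sets):
    """Expand each column into one indicator column per unique value."""
    x = np.asarray(x)
    blocks = [
        (x[:, [j]] == s[np.newaxis, :]).astype(float)  # shape (n, len(s))
        for j, s in enumerate(sets)
    ]
    return np.hstack(blocks)


x = np.array([[1, 5], [2, 5], [1, 7]])
x_sets = unique_per_column(x)
encoded = one_hot_encode(x, x_sets)
```

Here the first column has values {1, 2} and the second {5, 7}, so `encoded` has four indicator columns.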

uncoverml.transforms.onehot.one_hot(x, x_set, matrices=None)

uncoverml.transforms.onehot.sets(x)

Works on a masked x.

uncoverml.transforms.target module

class uncoverml.transforms.target.Identity

Bases: object

fit(y)
itransform(y_transformed)
transform(y)

class uncoverml.transforms.target.KDE

Bases: uncoverml.transforms.target.Identity

fit(y)
itransform(y_transformed)
transform(y)

class uncoverml.transforms.target.Log(offset=0.0, replace_zeros=True)

Bases: uncoverml.transforms.target.Identity

fit(y)
itransform(y_transformed)
transform(y)

class uncoverml.transforms.target.Logistic(scale=1)

Bases: uncoverml.transforms.target.Identity

itransform(y_transformed)
transform(y)

class uncoverml.transforms.target.RankGaussian

Bases: uncoverml.transforms.target.Identity

Forces the marginal histogram to be Gaussian.
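One common way to obtain a Gaussian marginal is to map each value's rank to a quantile and pass it through the inverse normal CDF. The sketch below illustrates that general technique; it is an assumption that RankGaussian works along these lines, and the function names are hypothetical. Tied values receive arbitrary distinct ranks in this simplified version.

```python
import numpy as np
from statistics import NormalDist


def rank_gaussian_transform(y):
    """Map y to standard-normal scores that preserve its ordering."""
    y = np.asarray(y, dtype=float)
    ranks = np.argsort(np.argsort(y))  # 0 for the smallest value
    quantiles = (ranks + 0.5) / len(y)  # strictly inside (0, 1)
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(q) for q in quantiles])


def rank_gaussian_inverse(z_new, y_train, z_train):
    """Map scores back to the original scale by linear interpolation."""
    order = np.argsort(z_train)
    return np.interp(z_new, z_train[order], y_train[order])


y = np.array([10.0, 0.1, 5.0, 200.0])
z = rank_gaussian_transform(y)
y_back = rank_gaussian_inverse(z, y, z)
```

The inverse interpolates between the training pairs, so scores outside the training range are clipped to the smallest or largest training target.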

fit(y)
itransform(y_transformed)
transform(y)

class uncoverml.transforms.target.Sqrt(offset=0.0)

Bases: uncoverml.transforms.target.Identity

itransform(y_transformed)
transform(y)

class uncoverml.transforms.target.Standardise

Bases: uncoverml.transforms.target.Identity

fit(y)
itransform(y_transformed)
transform(y)

uncoverml.transforms.transformset module

class uncoverml.transforms.transformset.ImageTransformSet(image_transforms=None, imputer=None, global_transforms=None, is_categorical=False)

Bases: uncoverml.transforms.transformset.TransformSet

class uncoverml.transforms.transformset.TransformSet(imputer=None, transforms=None)

Bases: object

uncoverml.transforms.transformset.build_feature_vector(image_chunks, is_categorical)

uncoverml.transforms.transformset.missing_percentage(x)

Module contents