Vector Capsule Networks Layers

Layer implementations for Vector Capsule Networks with Dynamic Routing. A complete tutorial on building a vector capsule network with NeoPulse® can be found here.

PrimaryCaps_Vector

PrimaryCaps_Vector:[channels=32, capsule_dim=16, kernel_size=9, strides=2, padding='valid', kernel_initializer='glorot_uniform', bias_initializer='zeros']

PrimaryCaps Layer in Vector Capsule Networks with Dynamic Routing.

This layer takes the basic features detected by the convolutional layer, produces combinations of those features, and groups them into vector capsules. This is achieved using three distinct processes: Convolution, Reshape, and Squash. The Squash function ensures that the length of each capsule vector lies between 0 and 1 without destroying the positional information encoded in the higher dimensions of the capsule vector.

Arguments

  • channels: Integer, the number of output capsules in each spatial point, similar to number of filters in Convolutional layers.

  • capsule_dim: Integer, the dimensionality of each output vector capsule.

  • kernel_size: Integer or a list of a single integer, specifying the kernel size of the Convolution process.

  • strides: Integer or a list of a single integer, specifying the strides of the Convolution process.

  • padding: One of "valid" or "same", padding pattern in Convolution process.

  • kernel_initializer: Specifying the kernel initializer type for the Convolution process (see keras initializer).

  • bias_initializer: Specifying the bias initializer type for the Convolution process (see keras initializer).

Input shape:

4D tensor of shape [batch, height, width, input_channels]

Output shape:

3D tensor of shape [batch, num_capsule, capsule_dim]
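
As a rough NumPy sketch of the shape bookkeeping only (the convolution weights are replaced by random placeholders and the sizes are illustrative assumptions): with 'valid' padding the spatial size shrinks as in an ordinary convolution, and the resulting feature map is reshaped so that every spatial position contributes channels capsules of dimension capsule_dim.

    import numpy as np

    batch, height, width = 2, 20, 20
    channels, capsule_dim = 32, 16
    kernel_size, strides = 9, 2

    # Spatial size after a 'valid' convolution
    out_h = (height - kernel_size) // strides + 1   # 6
    out_w = (width - kernel_size) // strides + 1    # 6

    # Stand-in for the convolution output with channels * capsule_dim filters
    conv_out = np.random.rand(batch, out_h, out_w, channels * capsule_dim)

    # Group the filters into vector capsules: [batch, num_capsule, capsule_dim]
    capsules = conv_out.reshape(batch, out_h * out_w * channels, capsule_dim)
    print(capsules.shape)   # (2, 1152, 16)
    # The Squash non-linearity (see the Squash layer below) is then applied
    # to every capsule vector.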

DigitCaps

DigitCaps:[num_capsule, capsule_dim, routings=3, kernel_initializer='glorot_uniform']

DigitCaps Layer in Vector Capsule Networks with Dynamic Routing.

This layer is the higher-level capsule layer that the primary capsules route to (using dynamic routing). DigitCaps is essentially an extension of a dense layer: instead of taking scalars and outputting scalars, it takes vector capsules and outputs vector capsules.

Arguments

  • num_capsule: Integer, the number of output capsule vectors.

  • capsule_dim: Integer, the dimensionality of output capsule vectors.

  • routings: Integer, the number of iterations for Dynamic Routing algorithm.

  • kernel_initializer: Specifying the initializer type for the Transform Matrix (see keras initializer).

Input shape:

3D tensor of shape [batch, input_num_capsule, input_capsule_dim]

Output shape:

3D tensor of shape [batch, num_capsule, capsule_dim]
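
A minimal NumPy sketch of the routing-by-agreement loop from "Dynamic Routing Between Capsules" (Sabour et al., 2017), which this layer is based on; the transform matrices W, the sizes, and the initialization are placeholder assumptions, not NeoPulse's internal implementation:

    import numpy as np

    def squash(s, axis=-1, eps=1e-8):
        sq = np.sum(np.square(s), axis=axis, keepdims=True)
        return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

    batch, in_caps, in_dim = 2, 1152, 16   # input:  [batch, input_num_capsule, input_capsule_dim]
    num_capsule, capsule_dim, routings = 10, 16, 3

    u = np.random.rand(batch, in_caps, in_dim)
    W = np.random.randn(in_caps, num_capsule, capsule_dim, in_dim) * 0.01   # transform matrices

    # Prediction vectors: every input capsule predicts every output capsule
    u_hat = np.einsum('ijkl,bil->bijk', W, u)       # [batch, in_caps, num_capsule, capsule_dim]

    b = np.zeros((batch, in_caps, num_capsule))     # routing logits
    for _ in range(routings):
        e = np.exp(b - b.max(axis=2, keepdims=True))
        c = e / e.sum(axis=2, keepdims=True)        # coupling coefficients (softmax over output capsules)
        s = np.einsum('bij,bijk->bjk', c, u_hat)    # weighted sum of the predictions
        v = squash(s)                               # [batch, num_capsule, capsule_dim]
        b = b + np.einsum('bijk,bjk->bij', u_hat, v)  # increase logits where predictions agree

    print(v.shape)   # (2, 10, 16)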

ClassCaps

ClassCaps:[num_capsule]

Computes the length of the capsule vectors.

Arguments

  • num_capsule: Integer, the number of input capsule vectors, which equals the number of classes.

Input shape:

3D tensor of shape [batch, num_capsule, capsule_dim]

Output shape:

2D tensor of shape [batch, num_capsule]
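
In other words, the layer reduces each capsule vector to its Euclidean length, which can be read as the class score; a minimal NumPy equivalent:

    import numpy as np

    capsules = np.random.rand(4, 10, 16)              # [batch, num_capsule, capsule_dim]
    class_scores = np.linalg.norm(capsules, axis=-1)  # one length per class
    print(class_scores.shape)                         # (4, 10)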

Squash

Squash:[axis=-1]

The non-linear activation used in Vector Capsule Networks. It drives the length of a long vector to just below 1 and the length of a short vector toward 0.

Note: "PrimaryCaps_Vector" and "DigitCaps" layers have included this process. If you use "PrimaryCaps_Vector" and "DigitCaps", "Squash" is not needed to be listed in architecture separately.

Arguments

  • axis: Integer, the axis to squash.

Input shape:

Arbitrary.

Output shape:

Same shape as input.
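
For reference, the squash non-linearity described in "Dynamic Routing Between Capsules" (Sabour et al., 2017) has the form squash(s) = (||s||^2 / (1 + ||s||^2)) * (s / ||s||): a vector of length 10 is squashed to length ≈ 0.99, while a vector of length 0.1 is squashed to length ≈ 0.01.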

MaskCaps

MaskCaps:[]

Masks a tensor of shape [None, num_capsule, capsule_dim] either by the capsule with the maximum length or by an additional input mask. Except for the max-length (or specified) capsule, all capsule vectors are masked to zeros, and the masked tensor is then flattened.

Input shape:

3D tensor of shape [batch, num_capsule, capsule_dim]

optional 2D Mask of shape [batch, num_capsule]

Output shape:

2D tensor of shape [batch, num_capsule * capsule_dim]
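
A minimal NumPy sketch of the masking step (the helper name and shapes are illustrative, not NeoPulse's internal code):

    import numpy as np

    def mask_capsules(capsules, mask=None):
        # capsules: [batch, num_capsule, capsule_dim]
        # mask:     optional one-hot tensor of shape [batch, num_capsule]
        if mask is None:
            lengths = np.linalg.norm(capsules, axis=-1)            # [batch, num_capsule]
            mask = np.eye(capsules.shape[1])[np.argmax(lengths, axis=1)]
        masked = capsules * mask[..., None]                        # zero out all other capsules
        return masked.reshape(capsules.shape[0], -1)               # [batch, num_capsule * capsule_dim]

    caps = np.random.rand(4, 10, 16)
    print(mask_capsules(caps).shape)   # (4, 160)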

Matrix Capsule Networks Layers

Layer implementations for Matrix Capsule Networks with EM Routing. A complete tutorial on building a matrix capsule network with NeoPulse® can be found here.

PrimaryCaps_Matrix

PrimaryCaps_Matrix:[channels=32, pose_size=4, kernel_size=1, strides=1, padding='valid', kernel_initializer='glorot_uniform', bias_initializer='zeros']

PrimaryCaps Layer in Matrix Capsule Networks with EM Routing.

This layer takes the basic features detected by the convolutional layer and applies a convolution filter (often 1x1), transforming the basic features into primary capsules. Each capsule contains a pose matrix (often 4x4) and an activation value. A regular convolution layer is used to implement PrimaryCaps.

Arguments

  • channels: Integer, the number of output capsules in each spatial point, similar to number of filters in Convolutional layers.

  • pose_size: Integer or a list of a single integer, specifying the size of the pose matrix of a capsule.

  • kernel_size: Integer or a list of a single integer, specifying the kernel_size of convolutional process.

  • strides: Integer or a list of a single integer, specifying the strides of convolutional process.

  • padding: One of "valid" or "same", padding pattern in Convolution process.

  • kernel_initializer: Specifying the kernel initializer type for the Convolution process (see keras initializer).

  • bias_initializer: Specifying the bias initializer type for the Convolution process (see keras initializer).

Input shape:

4D tensor of shape [batch, input_height, input_width, input_channel]

Output shape:

pose: 6D tensor of shape [batch, output_height, output_width, channels, pose_height, pose_width]

activation: 4D tensor of shape [batch, output_height, output_width, channels]
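
One common way to realise this with a regular convolution, sketched below with NumPy placeholders (an assumption about the implementation, not NeoPulse's exact code), is to produce channels * (pose_size^2 + 1) feature maps and split them into poses and activations:

    import numpy as np

    batch, out_h, out_w = 2, 14, 14
    channels, pose_size = 32, 4

    # Stand-in for the output of a 1x1 convolution with channels * (pose_size**2 + 1) filters
    conv_out = np.random.rand(batch, out_h, out_w, channels * (pose_size ** 2 + 1))

    poses = conv_out[..., :channels * pose_size ** 2]
    poses = poses.reshape(batch, out_h, out_w, channels, pose_size, pose_size)

    activations = conv_out[..., channels * pose_size ** 2:]   # raw values
    activations = 1.0 / (1.0 + np.exp(-activations))          # sigmoid -> [batch, H, W, channels]

    print(poses.shape, activations.shape)
    # (2, 14, 14, 32, 4, 4) (2, 14, 14, 32)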

ConvCaps

ConvCaps:[channels, kernel_size, strides, routings=3, kernel_initializer='glorot_normal']

Convolution Capsule layer for Matrix Capsule networks with EM Routing.

ConvCaps takes capsules as input and outputs capsules. ConvCaps is similar to a regular convolution layer, except that it uses EM Routing to compute the capsule output.

Arguments

  • channels: Integer, the number of output capsules in each spatial point, similar to number of filters in Convolutional layers.

  • kernel_size: Integer or a list of a single integer, specifying the kernel_size of capsule convolutional process.

  • strides: Integer or a list of a single integer, specifying the strides of capsule convolutional process.

  • routings: Integer, the number of iterations for EM Routing algorithm.

  • kernel_initializer: Specifying the kernel initializer type for EM Routing parameters (see keras initializer).

Input shape:

pose: 6D tensor of shape [batch, input_height, input_width, input_channels, input_pose_height, input_pose_width]

activation: 4D tensor of shape [batch, input_height, input_width, input_channels]

Output shape:

pose: 6D tensor of shape [batch, output_height, output_width, channels, pose_height, pose_width]

activation: 4D tensor of shape [batch, output_height, output_width, channels]
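
Both ConvCaps and the matrix-capsule ClassCaps below rely on EM Routing. The following is a heavily simplified NumPy sketch of the routing loop from "Matrix Capsules with EM Routing" (Hinton et al., 2018) for a single spatial position; the learned beta_u, beta_a and the inverse-temperature schedule are replaced by fixed placeholder values, and coordinate addition is omitted:

    import numpy as np

    def em_routing(votes, a_in, routings=3, beta_u=0.0, beta_a=0.0, lam=1.0, eps=1e-8):
        # votes: [n_in, n_out, d]  input poses transformed toward each output capsule
        # a_in:  [n_in]            input capsule activations
        n_in, n_out, d = votes.shape
        R = np.full((n_in, n_out), 1.0 / n_out)      # routing assignments

        for _ in range(routings):
            # M-step: fit one Gaussian per output capsule, weighted by R * a_in
            r = R * a_in[:, None]                                            # [n_in, n_out]
            r_sum = r.sum(axis=0) + eps                                      # [n_out]
            mu = (r[..., None] * votes).sum(axis=0) / r_sum[:, None]         # [n_out, d]
            sigma2 = (r[..., None] * (votes - mu) ** 2).sum(axis=0) / r_sum[:, None] + eps
            cost = ((beta_u + 0.5 * np.log(sigma2)) * r_sum[:, None]).sum(axis=1)
            a_out = 1.0 / (1.0 + np.exp(-lam * (beta_a - cost)))             # [n_out]

            # E-step: update assignments from the Gaussian log-likelihoods
            log_p = -0.5 * (np.log(2 * np.pi * sigma2) + (votes - mu) ** 2 / sigma2).sum(axis=2)
            logits = np.log(a_out + eps) + log_p                             # [n_in, n_out]
            logits -= logits.max(axis=1, keepdims=True)
            R = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

        return mu, a_out   # output poses (flattened pose matrices) and activations

    votes = np.random.randn(72, 10, 16)   # e.g. 72 input capsules, 10 output capsules, 4x4 poses
    a_in = np.random.rand(72)
    poses, activations = em_routing(votes, a_in)
    print(poses.shape, activations.shape)  # (10, 16) (10,)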

ClassCaps

ClassCaps:[num_capsule, routings=3, kernel_initializer='glorot_normal']

Class capsules layer for Matrix Capsule networks with EM Routing.

This layer integrates the features from ConvCaps using EM Routing and outputs one capsule per class.

Arguments

  • num_capsule: Integer, the number of output capsules, which equals the number of classes.

  • routings: Integer, the number of iterations for EM Routing algorithm.

  • kernel_initializer: Specifying the kernel initializer type for EM Routing parameters (see keras initializer).

Input shape:

pose: 6D tensor of shape [batch, input_height, input_width, input_channels, input_pose_height, input_pose_width]

activation: 4D tensor of shape [batch, input_height, input_width, input_channels]

Output shape:

pose: 4D tensor of shape [batch, num_capsule, pose_height, pose_width]

activation: 2D tensor of shape [batch, num_capsule]

Image Detection Layers

Layer implementations for image detection problems. A complete tutorial on building an SSD model with NeoPulse® can be found here.

AnchorBoxes

AnchorBoxes:[img_height, img_width, this_scale, next_scale, aspect_ratios=[0.5, 1.0, 2.0], two_boxes_for_ar1=True, this_steps=None, this_offsets=None, clip_boxes=True, variances=[0.1, 0.1, 0.2, 0.2], coords=None, normalize_coords=True]

A layer to create an output tensor containing anchor box coordinates and variances based on the input tensor and the passed arguments.

A set of 2D anchor boxes with different aspect ratios is created for each spatial unit of the input tensor. The number of anchor boxes created per unit depends on the arguments aspect_ratios and two_boxes_for_ar1; in the default case it is 4. The boxes are parameterized by the coordinate tuple (xmin, xmax, ymin, ymax).

Arguments

  • img_height: Integer, the height of the input images.

  • img_width: Integer, the width of the input images.

  • this_scale: A float in [0, 1], the scaling factor for the size of the generated anchor boxes as a fraction of the shorter side of the input image.

  • next_scale: A float in [0, 1], the next larger scaling factor. Only relevant if two_boxes_for_ar1 == True.

  • aspect_ratios: A list of floats, the list of aspect ratios for which default boxes are to be generated for this layer.

  • two_boxes_for_ar1: Bool, only relevant if aspect_ratios contains 1. If True, two default boxes will be generated for aspect ratio 1. The first will be generated using the scaling factor for the respective layer; the second will be generated using the geometric mean of that scaling factor and the next larger scaling factor.

  • this_steps: A single integer, float, or a list of 2 integers or floats. The step pixel values, i.e. how far apart the anchor box center points will be vertically and horizontally. The default value is [image_height/feature_map_height, image_width/feature_map_width].

  • this_offsets: A single integer, float, or a list of 2 integers or floats. The offset pixel values, i.e. at what pixel values from the top and from the left of the image the first anchor box center point will be placed. The default value is [0.5, 0.5].

  • clip_boxes: Bool, if True, clips the anchor box coordinates to stay within image boundaries.

  • variances: A list of 4 positive floats, the anchor box offset for each coordinate will be divided by its respective variance value.

  • coords: String, the box coordinate format to be used internally in the model (i.e. this is not the input format of the ground truth labels). Can be either 'centroids' for the format (cx, cy, w, h) (box center coordinates, width, and height), 'corners' for the format (xmin, ymin, xmax, ymax), or 'minmax' for the format (xmin, xmax, ymin, ymax).

  • normalize_coords: Bool, set to True if the model uses relative instead of absolute coordinates, i.e. if the model predicts box coordinates within [0,1] instead of absolute coordinates.

Input shape:

4D tensor of shape [batch, height, width, channels]

Output shape:

5D tensor of shape [batch, height, width, n_boxes, 8]. The last axis contains the four anchor box coordinates and the four variance values for each box.
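
As a sketch of how the per-cell anchor boxes and the trailing dimension of 8 come about (the scale values and image size below are placeholder assumptions):

    import numpy as np

    img_height, img_width = 300, 300
    this_scale, next_scale = 0.2, 0.37
    aspect_ratios, two_boxes_for_ar1 = [0.5, 1.0, 2.0], True
    variances = [0.1, 0.1, 0.2, 0.2]

    size = min(img_height, img_width)   # boxes are scaled relative to the shorter side
    wh = []
    for ar in aspect_ratios:
        wh.append((this_scale * size * np.sqrt(ar), this_scale * size / np.sqrt(ar)))
        if ar == 1.0 and two_boxes_for_ar1:
            # extra box for aspect ratio 1, using the geometric mean of the two scales
            s = np.sqrt(this_scale * next_scale) * size
            wh.append((s, s))

    n_boxes = len(wh)   # 4 in the default configuration
    # Every box is stored as 4 coordinates plus the 4 variance values,
    # which gives the trailing dimension of 4 + len(variances) = 8.
    print(n_boxes)      # 4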

L2Normalization

L2Normalization:[gamma_init=20]

Performs L2 normalization on the input tensor with a learnable scaling parameter, as described in the paper "ParseNet: Looking Wider to See Better" and as used in the original SSD model.

Arguments

  • gamma_init: Integer, the initial scaling parameter.

Input shape:

4D tensor of shape [batch, height, width, channels]

Output shape:

The scaled tensor. Same shape as the input tensor.
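
A minimal NumPy sketch of the operation, with gamma treated as a fixed array rather than a trainable weight:

    import numpy as np

    def l2_normalize(x, gamma_init=20.0, axis=-1, eps=1e-12):
        # Normalize each channel vector to unit L2 norm, then rescale it with a
        # per-channel factor initialised to gamma_init (learnable in the real layer).
        norm = np.sqrt(np.sum(np.square(x), axis=axis, keepdims=True)) + eps
        gamma = np.full(x.shape[axis], gamma_init, dtype=x.dtype)
        return gamma * x / norm

    x = np.random.rand(2, 38, 38, 512).astype(np.float32)
    print(l2_normalize(x).shape)   # (2, 38, 38, 512)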

Unsupervised Learning Layers

A complete tutorial on building an unsupervised learning model with NeoPulse® can be found here.

Kmeans

Kmeans:[n_clusters=8, init='k-means++', max_iter=100, batch_size=100, verbose=0, compute_labels=True, random_state=None, tol=0.0, max_no_improvement=10, init_size=None, n_init=3, reassignment_ratio=0.01, precompute_distances='auto', copy_x=True, n_jobs=None, algorithm='auto', batch=False]

Kmeans clustering layer for Unsupervised Learning. Two fitting modes are supported: data can either be fed in all at once or fed in batches.

Arguments

  • n_clusters: Integer, the number of clusters to form as well as the number of centroids to generate.

  • init: {'k-means++', 'random' or an ndarray}, Method for initialization, defaults to 'k-means++':

    • 'k-means++' : selects initial cluster centers for k-mean clustering in a smart way to speed up convergence. See section Notes in k_init for more details.

    • 'random': choose k observations (rows) at random from data for the initial centroids.

    • ndarray: if an ndarray is passed, it should be of shape (n_clusters, n_features) and gives the initial centers.

  • max_iter: Integer, maximum number of iterations of the k-means algorithm for a single run.

  • batch_size: Integer, only relevant when batch=True, size of the mini batches.

  • verbose: Bool, verbosity mode.

  • compute_labels: Bool, only relevant when batch=True, compute label assignment and inertia for the complete dataset once the minibatch optimization has converged in fit.

  • random_state: Integer, RandomState instance or None, determines random number generation for centroid initialization and random reassignment. Use an int to make the randomness deterministic.

  • tol: float, Relative tolerance with regards to inertia to declare convergence.

  • max_no_improvement: Integer, only relevant when batch=True, controls early stopping based on the consecutive number of mini-batches that do not yield an improvement on the smoothed inertia.

  • init_size: Integer, only relevant when batch=True, number of samples to randomly sample for speeding up the initialization (sometimes at the expense of accuracy): the only algorithm is initialized by running a batch KMeans on a random subset of the data. This needs to be larger than n_clusters.

  • n_init: Integer.

    When batch=False, it is the number of times the k-means algorithm will be run with different centroid seeds. The final results will be the best output of n_init consecutive runs in terms of inertia.

    When batch=True, it is the number of random initializations that are tried. In contrast to batch=False, the algorithm is only run once, using the best of the n_init initializations as measured by inertia.

  • reassignment_ratio: float, controls the fraction of the maximum number of counts for a center to be reassigned. A higher value means that low count centers are more easily reassigned, which means that the model will take longer to converge, but should converge in a better clustering.

  • precompute_distances: {‘auto’, True, False}, only relevant when batch=False, precompute distances (faster but takes more memory).

    • ‘auto’ : do not precompute distances if n_samples * n_clusters > 12 million. This corresponds to about 100MB overhead per job using double precision.

    • True : always precompute distances

    • False : never precompute distances

  • copy_x: Bool, only relevant when batch=False. When pre-computing distances it is more numerically accurate to center the data first. If copy_x is True (default), then the original data is not modified, ensuring X is C-contiguous. If False, the original data is modified and put back before the function returns, but small numerical differences may be introduced by subtracting and then adding the data mean. In this case it will also not ensure that the data is C-contiguous, which may cause a significant slowdown.

  • n_jobs: Integer, only relevant when batch=False, the number of jobs to use for the computation. This works by computing each of the n_init runs in parallel. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.

  • algorithm: {“auto”, “full” or “elkan”}, only relevant when batch=False, K-means algorithm to use. The classical EM-style algorithm is “full”. The “elkan” variation is more efficient by using the triangle inequality, but currently doesn’t support sparse data. “auto” chooses “elkan” for dense data and “full” for sparse data.

  • batch: Bool, the fitting mode. If batch=True, the data will be fed in batches; if batch=False, the data will be fed in all at once.

Input shape:

2D tensor of shape [batch, dim]

Output shape:

1D tensor of shape [batch]
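
The argument names closely mirror scikit-learn's KMeans (batch=False) and MiniBatchKMeans (batch=True) estimators, so the two modes can be read roughly as the following calls (a sketch under that assumption, not necessarily NeoPulse's exact backend):

    import numpy as np
    from sklearn.cluster import KMeans, MiniBatchKMeans

    X = np.random.rand(1000, 64)   # [batch, dim]

    # batch=False: fit on the whole dataset at once
    labels = KMeans(n_clusters=8, init='k-means++', max_iter=100,
                    n_init=3, tol=0.0, random_state=0).fit_predict(X)

    # batch=True: fit incrementally on mini-batches
    mbk = MiniBatchKMeans(n_clusters=8, batch_size=100, max_no_improvement=10,
                          reassignment_ratio=0.01, random_state=0)
    for start in range(0, len(X), 100):
        mbk.partial_fit(X[start:start + 100])
    labels_mb = mbk.predict(X)

    print(labels.shape, labels_mb.shape)   # (1000,) (1000,)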

Pca

Pca:[n_components=None, copy=True, whiten=False, svd_solver='auto', tol=0.0, iterated_power='auto', random_state=None, batch_size=None, batch=False]

PCA decomposition layer for Unsupervised Learning. Two fitting modes are supported: data can either be fed in all at once or fed in batches.

Arguments

  • n_components: Integer, Number of components to keep.

  • copy: Bool, if False, X will be overwritten.

  • whiten: Bool.

    When whiten=True the components_ vectors are multiplied by the square root of n_samples and then divided by the singular values to ensure uncorrelated outputs with unit component-wise variances.

    Whitening will remove some information from the transformed signal (the relative variance scales of the components) but can sometimes improve the predictive accuracy of the downstream estimators by making their data respect some hard-wired assumptions.

  • svd_solver: string, {‘auto’, ‘full’, ‘arpack’, ‘randomized’}, only relevant when batch=False.

    • auto : the solver is selected by a default policy based on X.shape and n_components: if the input data is larger than 500x500 and the number of components to extract is lower than 80% of the smallest dimension of the data, then the more efficient ‘randomized’ method is enabled. Otherwise the exact full SVD is computed and optionally truncated afterwards.

    • full : run exact full SVD calling the standard LAPACK solver via scipy.linalg.svd and select the components by postprocessing

    • arpack : run SVD truncated to n_components calling ARPACK solver via scipy.sparse.linalg.svds. It requires strictly 0 < n_components < min(X.shape)

    • randomized : run randomized SVD by the method of Halko et al.

  • tol: Float >= 0, only relevant when batch=False. Tolerance for singular values computed by svd_solver == ‘arpack’.

  • iterated_power: Integer >= 0, or ‘auto’, only relevant when batch=False. Number of iterations for the power method computed by svd_solver == ‘randomized’.

  • random_state: Integer, RandomState instance or None, only relevant when batch=False.

    If integer, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random. Used when svd_solver == ‘arpack’ or ‘randomized’.

  • batch_size: Integer or None, only relevant when batch=True. The number of samples to use for each batch. If batch_size is None, then batch_size is inferred from the data and set to 5 * n_features, to provide a balance between approximation accuracy and memory consumption.

  • batch: Bool, the fitting mode. If batch=True, the data will be fed in batches; if batch=False, the data will be fed in all at once.

Input shape:

2D tensor of shape [batch, dim]

Output shape:

2D tensor of shape [batch, n_components]
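
Similarly, the arguments mirror scikit-learn's PCA (batch=False) and IncrementalPCA (batch=True); a rough sketch of the corresponding calls (an assumption about the backend, with placeholder sizes):

    import numpy as np
    from sklearn.decomposition import PCA, IncrementalPCA

    X = np.random.rand(1000, 64)   # [batch, dim]

    # batch=False: decompose the whole dataset at once
    Z = PCA(n_components=16, whiten=False, svd_solver='auto').fit_transform(X)

    # batch=True: incremental PCA fed in mini-batches
    ipca = IncrementalPCA(n_components=16, batch_size=200)
    for start in range(0, len(X), 200):
        ipca.partial_fit(X[start:start + 200])
    Z_batch = ipca.transform(X)

    print(Z.shape, Z_batch.shape)   # (1000, 16) (1000, 16)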

UnsupervisedFlatten

UnsupervisedFlatten:[]

Flattens a sample of any shape into a 1D vector in Unsupervised Learning.

Input shape:

Arbitrary.

Output shape:

2D tensor of shape [batch, flatten_dim]
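
The operation is equivalent to a plain NumPy reshape (the shapes here are just an example):

    import numpy as np

    X = np.random.rand(32, 28, 28, 3)     # samples of any shape
    X_flat = X.reshape(X.shape[0], -1)    # [batch, flatten_dim]
    print(X_flat.shape)                   # (32, 2352)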