Architecture construct

The architecture construct is where the AI model is built. It contains three blocks as shown in this example:

architecture:
    input:  x ~ text: [200] ;
    output: y ~ flat: [2] ;

    x -> Embedding: [20000, 128]
      -> Dropout: [0.5]
      -> Conv1D: [64, 4]
      -> MaxPooling1D: [pool_size=4]
      -> LSTM: [128]
      -> Dense: [2, activation='sigmoid']
      -> y ;
  • input This defines a variable and shape for the input data. It must match what is defined in the input component of the source construct. The syntax for this block is:
input: variable ~ data_type: [shape] ;
  • output This defines a variable and shape for the output of the model. It must match what is defined in the output component of the source construct. The syntax for this block is:
output: variable ~ data_type: [shape] ;
  • neuralflow This is where the AI model itself is defined. A neuralflow is a series of network layers connecting the input to the output. NeoPulse® AI Studio is built on top of a subset of the Keras deep learning library, so machine learning experts can apply the knowledge they already have to quickly prototype new architectures.

auto

For classification tasks, the AI oracle can automatically determine an architecture based on the complexity of the problem, by using the auto keyword in the neuralflow declaration:

architecture:
    input:  x ~ text: [200] ;
    output: y ~ flat: [1] ;

    x -> auto -> y ;

We're constantly working to improve the AI oracle in NeoPulse® AI Studio, and automatic architectures for more tasks are coming soon.

Defining architectures

NML's neuralflow syntax makes it straightforward to design neural-network architectures. NeoPulse® AI Studio exposes most of the Keras deep learning library, so custom AI models can be built with the familiar Keras layers API. The generic NML syntax for a layer call is:

x -> LayerName: [argument1, argument2, named_argument=value...] -> ... -> y

The easiest way to understand it is to look at an example. Consider the following architecture construct:

architecture:
    input:  x ~ text: [200] ;
    output: y ~ flat: [2] ;

    x -> Embedding: [20000, 128]
      -> Dropout: [0.5]
      -> Conv1D: [64, 4]
      -> MaxPooling1D: [pool_size=4]
      -> LSTM: [128]
      -> Dense: [2, activation='sigmoid']
      -> y ;

In this example, text input is sent through a six-layer neuralflow, and the output is a binary classifier: the probability that the text belongs to each of two classes. This example architecture constructs an AI model using six layers:

  • Embedding is called using the syntax: Embedding: [20000, 128]. The required arguments input_dim and output_dim are implicitly defined in order.

  • Dropout is called using the syntax: Dropout: [0.5]. The required argument rate is implicitly defined.

  • Conv1D is called using the syntax: Conv1D: [64, 4]. The required arguments filters and kernel_size are implicitly defined in order.

  • MaxPooling1D is called using the syntax: MaxPooling1D: [pool_size=4]. The default value of pool_size is overridden by specifying it directly.

  • LSTM is called using the syntax: LSTM: [128]. The required argument units is implicitly defined.

  • Dense is called using the syntax: Dense: [2, activation='sigmoid']. The required argument units is implicitly defined; the named argument activation overrides the default activation.

The flow indicator -> connects layers from the input to the output.
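To see how these positional and named arguments determine tensor shapes, here is an illustrative Python sketch (not NeoPulse code) that propagates the input shape through the six layers by hand, assuming the usual Keras defaults ('valid' padding for Conv1D, stride equal to pool_size for MaxPooling1D, and an LSTM that returns only its final state):

```python
# Illustrative shape propagation through the six-layer neuralflow above.
# Assumes Keras defaults: 'valid' padding for Conv1D, non-overlapping
# windows for MaxPooling1D, and LSTM returning only its final state.

def conv1d_len(length, kernel_size):
    # 'valid' padding: the sequence shrinks by kernel_size - 1
    return length - kernel_size + 1

def maxpool1d_len(length, pool_size):
    # stride defaults to pool_size; partial trailing windows are dropped
    return length // pool_size

steps = 200                      # input: x ~ text: [200]
features = 128                   # Embedding output: (200, 128)
                                 # Dropout leaves the shape unchanged
steps = conv1d_len(steps, 4)     # Conv1D with kernel 4: (197, 64)
features = 64
steps = maxpool1d_len(steps, 4)  # MaxPooling1D pool_size=4: (49, 64)
units = 128                      # LSTM collapses the sequence to (128,)
units = 2                        # Dense with units matching flat: [2]

print((steps, features), units)  # (49, 64) 2
```

Each layer's required arguments fix the output shape of that step, which is why the final Dense layer's units must agree with the declared output shape.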

Caveats

As of this writing, some caveats apply.

Working with previously trained models

There are many reasons one might want to work with previously trained models. NeoPulse® AI Studio makes it easy to load a previously trained model for retraining using the from constructor.

Suppose that we obtain additional training data, and want to retrain a model on a larger dataset. Consider this NML script:

oracle("generated") = 2

source:
  bind = "/DM-Dash/my_project/data.csv" ;
  input:
    x ~ from "Review"
        -> text: [200] -> TextDataGenerator: [nb_words = 20000] ;
  output:
    y ~ from "Label" -> flat: [2] -> FlatDataGenerator: [] ;
  params:
    validation_split = 0.5,
    batch_size = 64,
    shuffle_init = True;

architecture:
    input:  x ~ text: [200] ;
    output: y ~ flat: [2] ;

    x -> Embedding: [20000, 128]
      -> Dropout: [auto(0.25 ? 0.50 | name = "Drop")]
      -> Convolution1D: [64, 4]
      -> MaxPooling1D: [pool_size = 4]
      -> LSTM: [128]
      -> Dense: [2, activation = 'sigmoid']
      -> y ;

train:
  compile:
    optimizer = 'rmsprop',
    loss = 'binary_crossentropy',
    metrics = ['accuracy'] ;
  run:
    epochs = 4 ;

  dashboard: ;

Let's retrain this model using new data:

  • First, identify the most accurate model on the validation set:
$ neopulse top my_project --metrics=val_acc

Example output:

neopulse: top models for my_project
1) model: BJpp1v1fb-1 iter: weights-003.model val_acc: 0.8655485042418984
2) model: BJpp1v1fb-0 iter: weights-003.model val_acc: 0.8616500455470704

The path to the model we want to retrain is then:

/DM-Dash/projects/my_project/BJpp1v1fb-1/results/weights-003.model

  • Next, since we used the auto keyword for the Dropout rate, we determine which of the candidate values (0.25 or 0.50) this model was trained with:
$ neopulse list my_project -m BJpp1v1fb-1
MODEL        STATUS              METRICS                    DROP
BJpp1v1fb-1  TRAINING_COMPLETED  val_loss,val_acc,loss,acc  0.50
  • Now we copy the original training script and make the following modifications (saving it as retrain.nml):

    1. Change the architecture construct declaration to use the from constructor and the path to the model we want to retrain:

      architecture from "/DM-Dash/projects/my_project/BJpp1v1fb-1/results/weights-003.model":

    2. Change the Dropout layer declaration to replace the auto keyword with the value for this model:

      -> Dropout: [0.50]

    3. If the name or path of the .csv file with the new training data is different, change that as well:

      bind = "/DM-Dash/my_project/new_data.csv" ;

  • Then we submit the retraining script:

$ neopulse train -p retrain_model -f retrain.nml

NOTE: When loading and retraining an AI model, you can vary hyper-parameters (such as the Dropout rate), but you CANNOT change the shape of the layers.
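The reason for this rule can be illustrated with a short Python sketch (hypothetical names, not NeoPulse internals): saved weights are arrays with fixed shapes, so a stateless hyper-parameter like the Dropout rate can change freely, while any change to a weighted layer's shape makes the saved weights unloadable:

```python
# Hypothetical sketch of why retraining allows changed hyper-parameters
# but not changed layer shapes: weights saved to disk have fixed shapes.

def check_compatible(declared, saved):
    """Loading saved weights succeeds only if every weighted layer's
    declared shape matches the saved shape exactly."""
    for name, shape in declared.items():
        if saved[name] != shape:
            raise ValueError(f"{name}: saved {saved[name]} != declared {shape}")
    return True

# Shapes as they might be stored in a trained model file (hypothetical).
saved = {"Embedding": (20000, 128), "Dense": (128, 2)}

# Varying the Dropout rate changes no weight shapes, so loading works.
assert check_compatible({"Embedding": (20000, 128), "Dense": (128, 2)}, saved)

# Changing Dense from 2 units to 3 would alter a weight shape and raise
# a ValueError, which is why layer shapes must stay fixed on retraining.
```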