On human motion prediction using recurrent neural networks

Sequence-to-sequence model for human motion prediction.

class algorithm.humanmotionrnn.models.Seq2SeqModel(*args: Any, **kwargs: Any)[source]

Sequence-to-sequence model for human motion prediction

Parameters

architecture – [basic, tied] whether to tie the encoder and decoder.
source_seq_len (int) – length of the input sequence.
target_seq_len (int) – length of the target sequence.
rnn_size (int) – number of units in the rnn.
num_layers (int) – number of rnns to stack.
max_gradient_norm (float) – gradients will be clipped to maximally this norm.
batch_size (int) – the size of the batches used during training; the model construction is independent of batch_size, so it can be changed after initialization if this is convenient, e.g., for decoding.

learning_rate (float) – learning rate to start with.
learning_rate_decay_factor (float) – decay learning rate by this much when needed.
loss_to_use – [supervised, sampling_based] whether to use ground truth in each timestep to compute the loss after decoding, or to feed back the prediction from the previous time-step.

Parameters

number_of_actions (int) – number of classes we have.
one_hot (bool) – whether to use one-hot encoding during training and testing (supervised models); default True.

residual_velocities – whether to use a residual connection that models velocities.
dtype – the data type to use to store internal variables; default torch.float32.
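
The max_gradient_norm parameter describes clipping gradients to a maximum norm. The following is a minimal pure-Python sketch of that rule, assuming the common global-norm formulation (the one PyTorch exposes as torch.nn.utils.clip_grad_norm_). The helper name clip_by_global_norm is hypothetical and not part of this package; the source does not state how the model applies clipping internally.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Scale all gradients so their joint L2 norm is at most max_norm.

    grads is a list of flat lists of floats, one per parameter tensor.
    Hypothetical helper for illustration; not part of humanmotionrnn.
    """
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # Only shrink: gradients already within the limit are left unchanged.
    scale = min(1.0, max_norm / total_norm) if total_norm > 0 else 1.0
    return [[g * scale for g in grad] for grad in grads], total_norm

grads = [[3.0, 4.0], [12.0]]  # global norm = sqrt(9 + 16 + 144) = 13
clipped, norm_before = clip_by_global_norm(grads, max_norm=5.0)
```

With the default max_gradient_norm of 5, every gradient in this toy example is scaled by 5/13, so the direction of the update is preserved and only its length changes.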

forward(encoder_inputs, decoder_inputs)[source]

Forward method for the model

Parameters

encoder_inputs – a tensor of shape [batch x length x dim]
decoder_inputs – a tensor of shape [batch x length x dim]
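
The shapes above can be made concrete with the defaults listed in algorithm.humanmotionrnn.params. This sketch rests on two assumptions that the source does not confirm: that the one-hot action label is appended to every pose frame, and that the encoder receives source_seq_len - 1 frames because the last observed frame seeds the decoder. The function expected_shapes and the value of 15 action classes (the number of actions in Human3.6M) are illustrative.

```python
HUMAN_SIZE = 54       # from algorithm.humanmotionrnn.params
SOURCE_SEQ_LEN = 50
TARGET_SEQ_LEN = 25
BATCH_SIZE = 16

def expected_shapes(number_of_actions, one_hot=True):
    """Return (encoder_shape, decoder_shape) as [batch x length x dim]."""
    # Assumption: the one-hot action label is concatenated to each frame.
    dim = HUMAN_SIZE + (number_of_actions if one_hot else 0)
    # Assumption: the encoder sees all observed frames except the last one.
    encoder = (BATCH_SIZE, SOURCE_SEQ_LEN - 1, dim)
    decoder = (BATCH_SIZE, TARGET_SEQ_LEN, dim)
    return encoder, decoder

enc_shape, dec_shape = expected_shapes(number_of_actions=15)
```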

sample(encoder_inputs)[source]

Sampling poses given the input

Parameters: encoder_inputs – a tensor of shape [batch x length x dim]
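
Sampling can be pictured as an autoregressive loop in which each predicted pose becomes the next decoder input. The sketch below shows that loop in plain Python, with an optional residual connection that treats the network output as a velocity. The names sample_autoregressive, step_fn and toy_step are hypothetical placeholders for illustration, not the package's implementation.

```python
def sample_autoregressive(step_fn, last_pose, state, target_seq_len,
                          residual_velocities=False):
    """Generate target_seq_len poses by feeding each prediction back in.

    step_fn(pose, state) -> (output, new_state) stands in for one decoder
    step; it is a hypothetical placeholder, not the package's RNN cell.
    """
    poses, pose = [], last_pose
    for _ in range(target_seq_len):
        output, state = step_fn(pose, state)
        if residual_velocities:
            # The raw output is treated as a velocity added to the input.
            output = [p + o for p, o in zip(pose, output)]
        poses.append(output)
        pose = output  # the prediction becomes the next decoder input
    return poses

def toy_step(pose, state):
    # Toy decoder step: always emits a constant offset of 0.5 per value.
    return [0.5 for _ in pose], state

predicted = sample_autoregressive(toy_step, [0.0, 1.0], None, 3,
                                  residual_velocities=True)
```

With the residual connection enabled, the toy decoder moves each value by 0.5 per step, which is the sense in which the connection "models velocities".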

algorithm.humanmotionrnn.params.HUMAN_SIZE = 54: number of values in a single pose vector

algorithm.humanmotionrnn.params.architecture = 'basic': seq2seq architecture to use: [basic, tied]

algorithm.humanmotionrnn.params.batch_size = 16: batch size

algorithm.humanmotionrnn.params.learning_rate = 0.005: learning rate

algorithm.humanmotionrnn.params.learning_rate_decay_factor = 0.95: learning rate decay
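
As a worked example of the two defaults above, each decay multiplies the current learning rate by 0.95. The source does not say when a decay is triggered, so the sketch below only computes the rate after a given number of decays; decayed_learning_rate is a hypothetical helper.

```python
LEARNING_RATE = 0.005   # params.learning_rate
DECAY_FACTOR = 0.95     # params.learning_rate_decay_factor

def decayed_learning_rate(num_decays):
    """Learning rate after num_decays multiplicative decay steps."""
    return LEARNING_RATE * DECAY_FACTOR ** num_decays

# Rates after 0, 1, 2 and 3 decays:
# roughly 0.005, 0.00475, 0.0045125, 0.004286875
schedule = [decayed_learning_rate(k) for k in range(4)]
```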

algorithm.humanmotionrnn.params.loss_to_use = 'sampling_based': The type of loss to use, supervised or sampling_based

algorithm.humanmotionrnn.params.max_gradient_norm = 5: maximum gradient norm

algorithm.humanmotionrnn.params.omit_one_hot = False: whether to omit the one-hot action encoding when loading the Human3.6M dataset

algorithm.humanmotionrnn.params.print_every = 50: printing frequency during training

algorithm.humanmotionrnn.params.residual_velocities = False: Add a residual connection that effectively models velocities

algorithm.humanmotionrnn.params.rnn_num_layers = 1: number of stacked RNN layers

algorithm.humanmotionrnn.params.rnn_size = 1024: rnn hidden size

algorithm.humanmotionrnn.params.source_seq_len = 50: input sequence length

algorithm.humanmotionrnn.params.target_seq_len = 25: target sequence length

algorithm.humanmotionrnn.params.test_subject_ids = [5]: test subject id

algorithm.humanmotionrnn.params.train_subject_ids = [1, 6, 7, 8, 9, 11]: training subject ids
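
The subject lists above define a split by person rather than by sequence. A short sanity check, written as a sketch with local copies of the two values, confirms that no subject appears on both sides and shows which subjects are covered in total.

```python
TRAIN_SUBJECT_IDS = [1, 6, 7, 8, 9, 11]  # params.train_subject_ids
TEST_SUBJECT_IDS = [5]                   # params.test_subject_ids

# A subject appearing in both lists would leak test data into training.
overlap = set(TRAIN_SUBJECT_IDS) & set(TEST_SUBJECT_IDS)

# Together the lists cover seven subjects.
all_subjects = sorted(TRAIN_SUBJECT_IDS + TEST_SUBJECT_IDS)
```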