On human motion prediction using recurrent neural networks

Sequence-to-sequence model for human motion prediction.

class algorithm.humanmotionrnn.models.Seq2SeqModel(*args: Any, **kwargs: Any)[source]

Sequence-to-sequence model for human motion prediction

Parameters
  • architecture – [basic, tied] whether to tie the encoder and decoder.

  • source_seq_len (int) – length of the input sequence.

  • target_seq_len (int) – length of the target sequence.

  • rnn_size (int) – number of units in the rnn.

  • num_layers (int) – number of rnns to stack.

  • max_gradient_norm (float) – gradients are clipped so that their norm does not exceed this value.

  • batch_size (int) – the size of the batches used during training; the model construction is independent of batch_size, so it can be changed after initialization if this is convenient, e.g., for decoding.

  • learning_rate (float) – learning rate to start with.

  • learning_rate_decay_factor (float) – decay learning rate by this much when needed.

  • loss_to_use (str) – [supervised, sampling_based] whether to use ground truth in each timestep to compute the loss after decoding, or to feed back the prediction from the previous time-step.

  • number_of_actions (int) – number of classes we have.

  • one_hot (bool) – whether to use one-hot encoding during training and testing (supervised models). Default: True.

  • residual_velocities (bool) – whether to use a residual connection that models velocities.

  • dtype – the data type to use to store internal variables; default torch.float32.
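The effect of max_gradient_norm can be illustrated with a small sketch. It assumes clipping by the global L2 norm, which is what torch.nn.utils.clip_grad_norm_ does; whether Seq2SeqModel clips in exactly this way is not stated here, and clip_by_global_norm is a hypothetical helper, not part of algorithm.humanmotionrnn.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Hypothetical helper for illustration only.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)  # already within the limit, leave unchanged
    scale = max_norm / total_norm
    return [g * scale for g in grads]


# The original norm is 10, so both values are scaled by 0.5.
clipped = clip_by_global_norm([6.0, 8.0], max_norm=5)
```

Gradients whose norm is already below the limit pass through unchanged, so the default of 5 only affects unusually large updates.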

forward(encoder_inputs, decoder_inputs)[source]

Forward method for the model

Parameters
  • encoder_inputs – a tensor of shape [batch x length x dim]

  • decoder_inputs – a tensor of shape [batch x length x dim]
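A sketch of how the two inputs might be cut from a single motion clip. The split convention below (the encoder sees the first source_seq_len - 1 frames and the decoder starts from the last observed frame) is an assumption based on common practice for this model family, and split_window is a hypothetical helper; the library's own data loader may differ.

```python
def split_window(frames, source_seq_len, target_seq_len):
    """Split one clip (a list of per-frame pose vectors) into model inputs.

    Hypothetical helper; the split convention is an assumption.
    """
    needed = source_seq_len + target_seq_len
    if len(frames) < needed:
        raise ValueError(f"need at least {needed} frames, got {len(frames)}")
    encoder_inputs = frames[: source_seq_len - 1]
    decoder_inputs = frames[source_seq_len - 1 : needed - 1]
    decoder_targets = frames[source_seq_len:needed]
    return encoder_inputs, decoder_inputs, decoder_targets


# 75 dummy frames of dimension 54, using the default lengths 50 and 25.
clip = [[float(t)] * 54 for t in range(75)]
enc, dec, tgt = split_window(clip, source_seq_len=50, target_seq_len=25)
```

Stacking several such windows along a leading axis gives tensors of shape [batch x length x dim] as expected by forward.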

sample(encoder_inputs)[source]

Sample poses given the input

Parameters
  • encoder_inputs – a tensor of shape [batch x length x dim]
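Sampling-style decoding can be sketched as an autoregressive loop in which each predicted pose is fed back as the next input. Here step_fn is a hypothetical stand-in for one decoder step, and seeding the loop with the last encoder frame is an assumption; this is not the actual body of sample.

```python
def sample_poses(encoder_frames, step_fn, target_seq_len):
    """Autoregressively roll out target_seq_len poses.

    step_fn(pose) -> next pose is a hypothetical stand-in for one decoder step.
    """
    current = encoder_frames[-1]  # assumption: seed with the last observed pose
    outputs = []
    for _ in range(target_seq_len):
        current = step_fn(current)  # the prediction becomes the next input
        outputs.append(current)
    return outputs


# Toy step function: shift every coordinate by 1.0.
predicted = sample_poses(
    [[0.0, 0.0]], lambda pose: [x + 1.0 for x in pose], target_seq_len=3
)
```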

algorithm.humanmotionrnn.params.HUMAN_SIZE = 54

dimensionality of the human pose vector for a single frame

algorithm.humanmotionrnn.params.architecture = 'basic'

Seq2seq architecture to use: basic or tied

algorithm.humanmotionrnn.params.batch_size = 16

batch size

algorithm.humanmotionrnn.params.learning_rate = 0.005

learning rate

algorithm.humanmotionrnn.params.learning_rate_decay_factor = 0.95

learning rate decay factor
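As a rough illustration of how the two defaults above interact, the sketch below assumes the decay is multiplicative, that is, each decay step multiplies the current learning rate by learning_rate_decay_factor. When a decay step is triggered during training is not specified here, and decayed_learning_rate is a hypothetical helper.

```python
learning_rate = 0.005
learning_rate_decay_factor = 0.95


def decayed_learning_rate(initial_lr, decay_factor, num_decays):
    """Learning rate after num_decays multiplicative decay steps (assumed schedule)."""
    return initial_lr * decay_factor ** num_decays


lr_after_one = decayed_learning_rate(learning_rate, learning_rate_decay_factor, 1)
lr_after_ten = decayed_learning_rate(learning_rate, learning_rate_decay_factor, 10)
```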

algorithm.humanmotionrnn.params.loss_to_use = 'sampling_based'

The type of loss to use, supervised or sampling_based

algorithm.humanmotionrnn.params.max_gradient_norm = 5

maximum gradient norm

algorithm.humanmotionrnn.params.omit_one_hot = False

whether to omit the one-hot encoding when loading the Human3.6M dataset

algorithm.humanmotionrnn.params.print_every = 50

printing frequency during training

algorithm.humanmotionrnn.params.residual_velocities = False

Add a residual connection that effectively models velocities
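The idea behind the residual connection can be sketched as follows: the network's raw output is treated as a per-frame displacement and added to the previous pose, so the network effectively predicts a velocity rather than an absolute pose. decoder_step is a hypothetical helper written for this illustration and does not reflect how the model is implemented internally.

```python
def decoder_step(prev_pose, raw_output, residual_velocities):
    """Combine the previous pose with the network's raw output for one step.

    Hypothetical helper: with the residual connection, raw_output is a
    displacement added to prev_pose; without it, raw_output is the pose itself.
    """
    if residual_velocities:
        return [p + r for p, r in zip(prev_pose, raw_output)]
    return list(raw_output)


prev = [1.0, 2.0]
raw = [0.5, -0.5]
with_residual = decoder_step(prev, raw, residual_velocities=True)
without_residual = decoder_step(prev, raw, residual_velocities=False)
```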

algorithm.humanmotionrnn.params.rnn_num_layers = 1

number of rnn layers

algorithm.humanmotionrnn.params.rnn_size = 1024

rnn hidden size

algorithm.humanmotionrnn.params.source_seq_len = 50

input sequence length

algorithm.humanmotionrnn.params.target_seq_len = 25

target sequence length

algorithm.humanmotionrnn.params.test_subject_ids = [5]

test subject ids

algorithm.humanmotionrnn.params.train_subject_ids = [1, 6, 7, 8, 9, 11]

training subject ids
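A minimal sketch of how the two subject lists could be used to partition data. The (subject_id, clip) pair layout and the split_by_subject helper are assumptions made for illustration; the package's actual loader is not shown here.

```python
train_subject_ids = [1, 6, 7, 8, 9, 11]
test_subject_ids = [5]


def split_by_subject(clips, train_ids, test_ids):
    """Partition (subject_id, clip) pairs into train and test lists."""
    train = [clip for subject, clip in clips if subject in train_ids]
    test = [clip for subject, clip in clips if subject in test_ids]
    return train, test


# Placeholder clip names stand in for real motion sequences.
clips = [(1, "walking_s1"), (5, "walking_s5"), (11, "eating_s11")]
train_clips, test_clips = split_by_subject(
    clips, train_subject_ids, test_subject_ids
)
```

Keeping the two id lists disjoint ensures that no subject seen during training appears at test time.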