The goal of this website is to annotate human whole-body motion with natural language. The resulting dataset will hopefully enable us to learn a mapping between motion and language. Such a mapping would be very useful, since it would allow us to generate motion from natural language, for example on a humanoid robot. However, we need your help to collect it!

The Dataset

If you're looking for the KIT Motion-Language Dataset, which contains the motion data and the annotations collected through this website, look no further. You can freely download the dataset. The download page also contains additional information and code examples to help you get started.

Getting Started


You'll first need an account. But don't worry, creating one is very easy and completely hassle-free. Simply sign up.

If you are a member of the H²T team, you don't even need to do that. Just sign in with your Redmine account!


As soon as you are signed in, we'll start showing you human whole-body motions like the one below.


We've made sure that you can inspect the motion as necessary. You can use the left mouse button to rotate the camera. Use the mouse wheel to zoom in and out. You can familiarize yourself with the controls in the example above. Give it a try now, it's fun!


The annotation process is really simple: After we've shown you a motion (like the one on the left), we ask you to tell us in a single sentence and in plain English what you see. That's it!

To give you a concrete example, a possible annotation for the motion that you're watching right now could be:

“A person performs a single squat.”

To make sure that the data we collect is consistent, we ensure that only a single sentence is entered and that the majority of words are spelled correctly.
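For readers curious how such a check might work, here is a minimal Python sketch. It is not the code this website runs: the function names, the tiny `KNOWN_WORDS` vocabulary, and the 50% threshold are all assumptions made for illustration. It counts sentence-ending punctuation to test for a single sentence and compares each word against a word list to estimate spelling.

```python
import re

# Hypothetical mini-vocabulary for illustration only; a real check would
# load a full English dictionary or use a spell-checking library.
KNOWN_WORDS = {"a", "the", "person", "performs", "single", "squat", "walks", "forward"}


def is_single_sentence(text):
    """Return True if the text ends with exactly one sentence terminator."""
    stripped = text.strip()
    if not stripped:
        return False
    # A run of terminators such as "..." or "?!" counts as one.
    terminators = re.findall(r"[.!?]+", stripped)
    return len(terminators) == 1 and stripped[-1] in ".!?"


def mostly_spelled_correctly(text, vocabulary=KNOWN_WORDS, threshold=0.5):
    """Return True if more than `threshold` of the words are in the vocabulary."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return False
    known = sum(1 for word in words if word in vocabulary)
    return known / len(words) > threshold


def validate_annotation(text):
    """Combine both checks (assumed rules, not the website's actual logic)."""
    return is_single_sentence(text) and mostly_spelled_correctly(text)
```

With this sketch, the example annotation above passes, while a two-sentence description or one with mostly misspelled words is rejected. Note that counting punctuation is a simplification: an abbreviation such as "e.g." would be mistaken for extra sentences.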

Additional Information

If you are interested, we are more than happy to give you more details about the nature and the purpose of this dataset. All motions were recorded at the High Performance Humanoid Technologies (H²T) lab, which is part of the Faculty of Informatics at the Karlsruhe Institute of Technology. All recorded motion is freely available through the KIT Whole-Body Human Motion Database. While these motions are annotated with simple tags (e.g. walk, run, ...), they lack more comprehensive annotations in the form of natural language.

This website was created during a student research program called “Praxis der Forschung”. During this program, several students can work on a project for two semesters and hopefully arrive at interesting results. In our case, we are interested in a better understanding of human motion and natural language. More precisely, we are interested in learning a bidirectional mapping that would allow us to essentially translate between motion and language. Such a system would be extremely useful in various scenarios. Imagine, for example, a humanoid robot that would be able to perform a motion in response to a command uttered in natural language.

To work on this project, we need a large dataset. This is why we created this website to help us annotate existing motions with natural language. We can then use this dataset to train machines to understand the connection between motion and language using machine learning techniques. We also plan to make this dataset freely available, so that anyone can experiment with it.