Roomba: automatic validation, correction and generation of dataset metadata