RAN simulator is not what you need: O-RAN reinforcement learning for the wireless factory