On using deep reinforcement learning to dynamically derive 5G new radio TDD pattern