A deep reinforcement learning framework for scalable slice orchestration in beyond 5G networks