RWKV-LM: RNN with transformer-level performance, without using attention (github.com/BlinkDL)
5 points | azhenley | 3 years ago