RWKV-LM: RNN with transformer-level performance, without using attention (github.com/BlinkDL)
4 points | vletal | 4 years ago