2022

CORL is an open-source library that provides single-file implementations of Deep Offline Reinforcement Learning algorithms. It emphasizes a simple development experience with a straightforward codebase and a modern analysis tracking tool. In CORL, we …

Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows

Offline reinforcement learning aims to train a policy on a pre-recorded and fixed dataset without any additional environment interactions. There are two major challenges in this setting: (1) extrapolation error caused by approximating the value of …

Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size

Training large neural networks is known to be time-consuming, with the learning duration taking days or even weeks. To address this problem, large-batch optimization was introduced. This approach demonstrated that scaling mini-batch sizes with …

CORL

High-quality single-file implementations of SOTA Offline RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC

Towards Interaction-based User Embeddings in Sequential Recommender Models

Because they learn user-specific embeddings, transductive recommender systems cannot make predictions for users who were not included in the training sample. In this paper, we propose a new method for replacing identity-based …

Tinkoff.AI Saint Research

EXACT: How to Train Your Accuracy

Classification tasks are usually evaluated in terms of accuracy. However, accuracy is discontinuous and cannot be directly optimized using gradient ascent. Popular methods minimize cross-entropy, Hinge loss, or other surrogate losses, which can lead …

2022

Tinkoff Pro Open Source

How the artificial intelligence research department at Tinkoff works

Tinkoff: Cracking AI Research