Policies
The agent follows a "policy" to make decisions.
A policy defines the mapping from environment observations to the actions the agent takes.
Within Dojo, the policy can be thought of as the agent's DeFi strategy. This is where you can get creative by implementing your own strategy!
Purpose
Policies generally provide the following functionality:
- Testing: the policy's predict(obs) method is run to generate the sequence of actions to execute in each block.
- Training: the policy's fit(*args, **kwargs) method can be implemented and run to optimize the policy parameters, for example to maximize the agent's reward function, or under any other optimization framework you like.
Policy Implementation
To test out your own DeFi policy, all you need to do is implement the predict(obs) method. It takes the observation object from the environment as input and returns a list of actions as output.
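The shape of a policy can be sketched as follows. Note that this is a minimal illustration, not dojo's actual API: the base class, observation type, and action type shown here are placeholders, and dojo's real policy classes will differ in their names and signatures.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class BasePolicy(ABC):
    """Hypothetical stand-in for dojo's policy base class."""

    @abstractmethod
    def predict(self, obs: Any) -> List[Any]:
        """Map an environment observation to a list of actions."""
        ...


class DoNothingPolicy(BasePolicy):
    """Simplest possible policy: observe everything, never act."""

    def predict(self, obs: Any) -> List[Any]:
        # Returning an empty list means no actions are taken this block.
        return []
```

Any strategy, however sophisticated, reduces to this interface: inspect the observation, decide, and return actions.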
Example 1: Test your DeFi strategy
In this example, we consider a basic policy for trading on UniswapV3. We define an upper and a lower spot price; once the price reaches either limit, all tokens are converted into whichever token currently has the lower value:
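A self-contained sketch of such a band policy is shown below. The SwapAction placeholder, the "price" observation field, and the token names are assumptions made for illustration; dojo's real UniswapV3 action and observation objects have their own types and fields.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SwapAction:
    """Placeholder action: swap the full balance of `token_in` for `token_out`."""
    token_in: str
    token_out: str


class PriceBandPolicy:
    """Convert all holdings into the lower-valued token once the spot price
    of token0 (quoted in token1) leaves the [lower_limit, upper_limit] band."""

    def __init__(self, lower_limit: float, upper_limit: float):
        self.lower_limit = lower_limit
        self.upper_limit = upper_limit

    def predict(self, obs) -> List[SwapAction]:
        price = obs["price"]  # assumed field: spot price of token0 in token1
        if price >= self.upper_limit:
            # token0 is now the higher-valued token: hold token1 instead
            return [SwapAction(token_in="token0", token_out="token1")]
        if price <= self.lower_limit:
            # token0 is now the lower-valued token: hold token0 instead
            return [SwapAction(token_in="token1", token_out="token0")]
        return []  # price is inside the band: do nothing
```

The policy is deliberately stateless between blocks; all the information it needs arrives in each observation.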
Example 2: Train your DeFi strategy
If you want to take it one step further, dojo allows you to encode a parametric model in your policy and optimize it however you want.
To show you how, we take the static policy from Example 1 and train it, tuning the upper and lower limit parameters to improve the performance of your strategy.
To start with, we might hypothesize that when volatility is high, the limits should be further apart. Let's implement the simplest version of this:
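The idea above can be sketched as a policy whose band width is a tunable multiple of recent price volatility. Everything here is illustrative: the update_limits helper, the evaluate callback, and the grid-search fit are assumptions standing in for whatever simulation loop and optimizer you actually use with dojo.

```python
import statistics
from typing import Sequence


class VolatilityBandPolicy:
    """Band policy whose width is a learned multiple of recent volatility."""

    def __init__(self, width_factor: float = 2.0):
        self.width_factor = width_factor  # the parameter that fit() tunes
        self.lower_limit = 0.0
        self.upper_limit = float("inf")

    def update_limits(self, recent_prices: Sequence[float]) -> None:
        """Centre the band on the latest price, +/- width_factor * stdev."""
        mid = recent_prices[-1]
        vol = statistics.stdev(recent_prices)
        self.lower_limit = mid - self.width_factor * vol
        self.upper_limit = mid + self.width_factor * vol

    def fit(self, candidate_factors: Sequence[float], evaluate) -> float:
        """Naive grid search: keep the width factor with the best reward.

        `evaluate(factor)` is assumed to run a backtest with that factor
        and return a scalar reward; any optimizer could replace this loop.
        """
        best = max(candidate_factors, key=evaluate)
        self.width_factor = best
        return best
```

In practice you would plug a real backtest into evaluate, or replace the grid search entirely with gradient-based or Bayesian optimization; the point is only that the policy exposes parameters that training can adjust.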
These are just examples of testing and training policies to get you started. You can get a lot more creative and sophisticated than this.