DeFi Arbitrage Bots

Mike Radin, 2023/08/20

Old News

The code discussed here is old; it was written in 2021. Since then, not only has the Tezos DeFi ecosystem changed, but the core platform itself has seen upgrades. The code is provided without warranty, etc.

DeFi Pool Primer

DeFi markets, while ostensibly offering stock-market-like functionality, have important differences in their operation. For these markets to be trustless and decentralized, asset prices must be set by software in some automated way. The first popular deployment of this was the Uniswap AMM (Automated Market Maker) pool (or dex, Decentralized EXchange) in 2018. An implementation of a constant product automated market, it operated on the idea that the price of Asset A in terms of Asset B is the ratio of the two pool balances and will float as users swap A for B and vice versa. Because there are also external price sources, centralized markets along the lines of Binance and Coinbase, the ratio (or price) on the AMM will not be wildly different from those sources, since that would create lucrative arbitrage opportunities. There are other types of markets that use oracle data to set asset prices.

When such a market (a smart contract) is deployed it holds some amount of Asset A and Asset B. For example, at the time of writing ETH is at 1664 USD, so if deploying a new AMM now the ratio of ETH to DAI would be 1:1664. Say we seed the market with liquidity of 10 ETH, then we'd also need to offer 16,640 DAI. With this initial liquidity in place, other liquidity providers can join at the same ratio. Say the liquidity grew to 1,000 ETH and 1,664,000 DAI. Now if a user were to swap 10 DAI for ETH they would get something like 0.006 ETH. Such a small trade, relative to total liquidity, wouldn't affect the asset ratio much and the "price" of ETH would remain roughly the same. For a larger trade, say of 200 ETH for DAI, the ratio would move more significantly. In a simple constant product market the product of the two pool balances stays the same across a swap, so the user would expect to get less than the 332,800 DAI that you'd arrive at with simple multiplication. This price change is called slippage. Additionally, liquidity providers are incentivized through fees. It's common for them to get 0.05% of the trade output, meaning that a market user would get 0.05% less output token than dictated by the pure constant product function. There are other types of CFMM (Constant Function Market Maker) markets that have more complex curves.
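The numbers in the example above can be reproduced with a few lines of TypeScript. This is a minimal sketch of a constant product swap, not code from the repository; the function name and the choice to take the fee from the output (as described above) are assumptions made for illustration.

```typescript
// Constant product swap: reserveIn * reserveOut is the same before and after
// the swap (ignoring fees). Name and signature are illustrative only.
function swapOutput(
  amountIn: number,
  reserveIn: number,
  reserveOut: number,
  outputFee: number = 0 // fraction of the output kept for liquidity providers
): number {
  const gross = (amountIn * reserveOut) / (reserveIn + amountIn);
  return gross * (1 - outputFee);
}

const ethReserve = 1000;
const daiReserve = 1664000;

// Small trade: 10 DAI in, the price barely moves (about 0.006 ETH out).
const smallOut = swapOutput(10, daiReserve, ethReserve);

// Large trade: 200 ETH in, well short of 200 * 1664 = 332,800 DAI.
const largeOut = swapOutput(200, ethReserve, daiReserve);
const slippage = 1 - largeOut / (200 * 1664);

// Same trade with a 0.05% fee taken from the output.
const largeOutAfterFee = swapOutput(200, ethReserve, daiReserve, 0.0005);

console.log({ smallOut, largeOut, slippage, largeOutAfterFee });
```

The large trade adds 20% to the ETH side of the pool, which is why the shortfall against the naive multiplication is so visible.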

The example above is relatable, but not a great one. It talks about a market that swaps a coin (a native blockchain gas currency) for a token (in this case ERC20). These are two different types of assets and they are treated differently in the code. On other chains there may also be extra considerations for the native coin. There are some projects that normalize their platform around some artificial token. Take Plenty as an example: there was no XTZ/USDtz market, instead there was a Plenty/USDtz market or a CTEZ/USDtz market. Either of these options introduces economic and software complexity. On the other hand, QuipuSwap has a USDtz/ETHtz market, which is conceptually easy to understand.

Theory of Operation

Depending on the depth of liquidity on matched pools (two or more pools for the same asset pair) and the size and configuration of a pending trade, it may be possible to make a profit by rebalancing the pools before or after the target trade operation is included in a block (meaning, "executed") by a validator.

For arbitrage to work, not only must the bot trades and the target trade be included in the same block, they must also be ordered correctly. The bot trade must be immediately adjacent to the target trade. There are two possible arbitrage opportunities: front-running and trailing.

Front-running arbitrage is not common. It requires the target trade to be poorly configured, something that recent DeFi UIs actively discourage. To be targets, these trades need to effectively swap Asset A for B at market price. That is, they don't set a minimum output amount, which would otherwise implicitly set a price for Asset B in terms of Asset A. In this scenario the arbitrage bot can submit a trade that increases the price of B in terms of A, then the target trade will raise the price further, and finally the arbitrage bot can execute the reverse trade right after the target trade to make a profit. This sort of trade sandwich would happen against a single pool. It works because the target trade doesn't care how much output token it gets for its input.
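To make the sandwich concrete, here is a small simulation against a single constant product pool. It is a sketch under stated assumptions: the pool balances, trade sizes and the 0.3% input fee are made up, and none of the names come from the repository.

```typescript
// Illustrative single-pool sandwich; all numbers and names are hypothetical.
interface Pool { a: number; b: number }

const FEE = 0.003; // assumed 0.3% fee taken from the input amount

// Swap on a constant product pool; aForB selects the direction.
function swap(pool: Pool, amountIn: number, aForB: boolean): { pool: Pool; out: number } {
  const reserveIn = aForB ? pool.a : pool.b;
  const reserveOut = aForB ? pool.b : pool.a;
  const effectiveIn = amountIn * (1 - FEE);
  const out = (effectiveIn * reserveOut) / (reserveIn + effectiveIn);
  const next = aForB
    ? { a: pool.a + amountIn, b: pool.b - out }
    : { a: pool.a - out, b: pool.b + amountIn };
  return { pool: next, out };
}

const start: Pool = { a: 1000, b: 1000 };
const targetIn = 100; // target swaps 100 A for B with no minimum output
const botIn = 50;

// What the target would receive with no interference.
const fairOut = swap(start, targetIn, true).out;

// 1. Bot buys B first, pushing up the price of B in terms of A.
const front = swap(start, botIn, true);
// 2. Target trade executes at the worse price.
const victim = swap(front.pool, targetIn, true);
// 3. Bot sells its B back into the pool right after the target.
const back = swap(victim.pool, front.out, false);

const botProfit = back.out - botIn; // in units of A, before operation fees
const targetShortfall = fairOut - victim.out; // B the target gave up

// A minimum output close to fairOut would have made step 2 fail.
const minOut = fairOut * 0.99;
const targetWouldRevert = victim.out < minOut;

console.log({ botProfit, targetShortfall, targetWouldRevert });
```

With a sensible minimum output the target trade fails, and the bot is left holding B it bought at a premium, which is why this only works against poorly configured trades.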

Trailing arbitrage is more common, but it requires at least two pools for the same asset pair to operate. The target trade swapping Asset A for B is submitted to the first pool, which alters the A/B ratio in that pool: B becomes more expensive there than in the other pool. A bot holding A can then buy B on the other pool, where it is now relatively cheap, and swap that B back for A on the first pool for a profit, thereby balancing the pools again. It could also be the case that a bot holding B executes the reverse of the target trade on the original pool and then sells the resulting asset on the other pool for a profit. Pools of the same asset pairs are generally balanced within a small margin exactly because these arbitrage bots are common. A skilled DeFi user would split the A/B swap across both pools in the same block to reduce the amount of profit an arb bot can make. This would get the user a better A/B ratio and more B asset for their A asset.
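A quick way to tell whether two pools are out of balance by more than the fees is to compare their marginal prices. The sketch below assumes two constant product pools that both charge the same input fee; the function name, the 0.3% default and the sample balances are illustrative and not taken from the repository. It only answers "is there anything to take at all", since a real trade also moves the price as it executes.

```typescript
interface Pool { a: number; b: number } // balances of Asset A and Asset B

// For a bot that starts with B: returns the pool on which to swap B for A
// first (the A is then swapped back to B on the other pool), or null when
// the imbalance is smaller than the two fees.
function trailingRoute(first: Pool, second: Pool, fee: number = 0.003): "first" | "second" | null {
  const keep = (1 - fee) * (1 - fee); // two swaps, a fee on each
  // Marginal B returned per B spent for each route.
  const startOnFirst = keep * (first.a / first.b) * (second.b / second.a);
  const startOnSecond = keep * (second.a / second.b) * (first.b / first.a);
  if (startOnFirst > 1) return "first";
  if (startOnSecond > 1) return "second";
  return null;
}

// Two balanced pools: fees eat any round trip.
const balanced = trailingRoute({ a: 1000, b: 1000 }, { a: 1000, b: 1000 });

// After a target trade pushed A into the first pool and took B out of it.
const unbalanced = trailingRoute({ a: 1050, b: 952.5 }, { a: 1000, b: 1000 });

console.log({ balanced, unbalanced });
```

In the second case the route starts on the first pool, which is the "reverse of the target trade on the original pool" variant described above.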

Finding Opportunities

Here we'll concentrate on trailing arbitrage. We need to find trade operation groups that may unbalance a pool enough that we can pull matching tokens from another one for a profit.

Finding the right operation in the mempool requires knowledge of the specific AMM pool and asset tokens. Let's take an example from block 1622705. Here we see operation ...Xzzw where some account is exchanging tzBTC for XTZ on QuipuSwap. The user was attempting to swap 0.00747332 tzBTC for a minimum of 96.967539 XTZ. Prior to this operation the QuipuSwap pool held 0.64496535 tzBTC and 8,533.072917 XTZ. Based on this we'd expect the outcome of the trade to be 0.00747332 * 8533.072917 * 997 / (0.64496535 * 1000 + 0.00747332 * 997) = 97.451701 after accounting for a 0.3% fee, which is what this account gets at the end of that operation group. See getTokenToCashExchangeRate(); exchangeMultiplier in this case is 997 because 1 - 997/1000 is 0.3%. The resulting pool balance is 0.65243867 tzBTC to 8,435.621216 XTZ.
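The same calculation can be done in whole base units, the way it happens on chain. The sketch below is not the repository's getTokenToCashExchangeRate(); tokenToCashOutput is a hypothetical stand-in that follows the arithmetic shown above, with tzBTC expressed in units of 8 decimal places and XTZ in mutez (6 decimal places).

```typescript
// Constant product output with the fee taken from the input.
// exchangeMultiplier = 997 out of 1000 corresponds to a 0.3% fee.
// Hypothetical helper for illustration, not the repository's function.
function tokenToCashOutput(
  tokenIn: bigint,
  tokenPool: bigint,
  cashPool: bigint,
  exchangeMultiplier: bigint = 997n
): bigint {
  const numerator = tokenIn * cashPool * exchangeMultiplier;
  const denominator = tokenPool * 1000n + tokenIn * exchangeMultiplier;
  return numerator / denominator; // bigint division truncates toward zero
}

const tzbtcIn = 747332n;        // 0.00747332 tzBTC
const tzbtcPool = 64496535n;    // 0.64496535 tzBTC
const xtzPool = 8533072917n;    // 8,533.072917 XTZ in mutez
const minimumOut = 96967539n;   // 96.967539 XTZ requested by the user

const xtzOut = tokenToCashOutput(tzbtcIn, tzbtcPool, xtzPool);
const accepted = xtzOut >= minimumOut;

// Pool balances after the trade.
const tzbtcPoolAfter = tzbtcPool + tzbtcIn;
const xtzPoolAfter = xtzPool - xtzOut;

console.log(xtzOut.toString(), accepted, tzbtcPoolAfter.toString(), xtzPoolAfter.toString());
```

The result is 97451701 mutez, or 97.451701 XTZ, above the user's minimum, and the updated balances match the resulting pool state quoted above.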

This adjustment causes an imbalance relative to the Sirius pool. Now we see operation ...KF73 in which the bot account swaps XTZ/tzBTC and then tzBTC/XTZ for a profit of 0.89901 XTZ after transaction fees. Not an exciting amount on its own, but it adds up over hundreds of trades. The bot executes the opposite direction of the target trade and takes into account the XTZ balance it has available to work out how much it can possibly arbitrage.

The formula in the QuipuSwap example above gives us the coin output (referred to as cash in the code). Here's how we get the token output, using 100 XTZ as an example:

100 * 0.65243867 * 997 / (8435.621216 * 1000 + 100 * 997) = 0.00762105

So 100 XTZ would yield 762105 sats from QuipuSwap. What can we get for them on Sirius? (The code refers to Sirius as ArthurSwap; "liquidity baking" was Arthur's attempt at increasing DeFi liquidity on Tezos.) Using the same token-to-cash formula as before we get:

0.00762105 * 1093481.852659 * 997 / (82.32168282 * 1000 + 0.00762105 * 997) = 100.917672

There is profit to be made if the transaction can be executed for less than 0.91 XTZ.
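The round trip above can be checked with a short script. This sketch uses floating point and human-readable units for brevity, which is fine for a sanity check but not how amounts should be handled when building real operations; the names are illustrative and the pool balances are the ones quoted in the text.

```typescript
interface Pool { token: number; cash: number } // tzBTC and XTZ balances

// Constant product output with a 0.3% input fee (multiplier 997 / 1000).
function swapOut(amountIn: number, reserveIn: number, reserveOut: number): number {
  return (amountIn * reserveOut * 997) / (reserveIn * 1000 + amountIn * 997);
}

const quipu: Pool = { token: 0.65243867, cash: 8435.621216 };     // after the target trade
const sirius: Pool = { token: 82.32168282, cash: 1093481.852659 };

const xtzIn = 100;
// Leg 1: XTZ -> tzBTC on QuipuSwap, where tzBTC is now cheap.
const tzbtc = swapOut(xtzIn, quipu.cash, quipu.token);
// Leg 2: tzBTC -> XTZ on Sirius.
const xtzBack = swapOut(tzbtc, sirius.token, sirius.cash);

// Upper bound on what the bot can spend on operation fees.
const grossProfit = xtzBack - xtzIn;

console.log({ tzbtc, xtzBack, grossProfit });
```

The gross profit comes out at roughly 0.92 XTZ, which is the budget the bot has for fees before the trade stops being worth it.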

Pricing Operations

Tezos doesn't have an explicit gas price, nor does it have a way to prioritize operations within a block at the platform level. There is an implicit way to do this, however. Validators optimize for profit as well, so they'll fill the block with operations up to the block gas limit starting with the highest fee operations first. The actual ordering is by operation_fee / submitted_operation_gas, which means we can manipulate the ordering of operations through this ratio. Technically validators can prioritize operations any way they want, but very few run custom nodes that deviate from the default behavior.

In the example above the target trade fee-to-gas ratio was 0.1310097807. (This is not entirely correct: the operation storage "gas" is also taken into account, that is, the cost of storing the operation bytes in the block, not the cost of operation contract storage adjustment.) So as long as the bot submits its arb operation with a fee-to-gas ratio slightly less than 0.1310097807, that operation will be ordered immediately behind the target. Note that the ratio is calculated not from the actual gas used by the operation but from the gas limit provided in the operation. Further, gas and fee are taken for the aggregate operation group, not for individual operations.

Hence, at the time the code was written, the bot had up to 1,040,000 gas units per operation, up to the 5,200,000 block gas limit (less higher priority operations ahead of it in the current block) and up to 0.91 XTZ in fees to get the ratio where it needs to be. The base operation gas cost is calculated by the bot at the start of the session to save time since it generally won't change much. For example, having the bot retain a small token balance after each trade helps stabilize the gas cost. Each of the operations in the group must have its gas limit set to at least the minimum required. The upper limit of the gas is whatever is necessary to get the fee-to-gas ratio as close as possible to that of the target trade while remaining below it. If the necessary gas is greater than the per-operation limit, just add the rest to the next operation in the group and so on. The fee can be attached only to the first operation. Finally, if the operation group of three operations is not enough to get to the appropriate ratio, it's possible to pad the operation group with self-transfers to consume additional gas limit. This is likely irrelevant for the latest protocol version due to the reduced block gas limit.
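The gas padding logic described above can be sketched as follows. This is not the code from /src/util/arbUtil; the function names, the assumption that the ratio is expressed as mutez of fee per unit of gas limit, and the sample minimum gas figures are all hypothetical, and the storage-byte component mentioned above is ignored.

```typescript
// Smallest total gas limit that puts fee / gas strictly below the target's ratio.
function gasForTrailing(feeMutez: number, targetRatio: number): number {
  return Math.floor(feeMutez / targetRatio) + 1;
}

// Give every operation its minimum gas, then pour the remainder into the
// operations in order, never exceeding the per-operation limit. Whatever is
// left over has to be absorbed by padding operations such as self-transfers.
function distributeGas(
  totalGas: number,
  minGas: number[],
  perOpLimit: number
): { limits: number[]; padding: number } {
  const limits = [...minGas];
  let remaining = totalGas - minGas.reduce((sum, g) => sum + g, 0);
  if (remaining < 0) {
    throw new Error("fee too low: minimum gas already pushes the ratio below the target");
  }
  for (let i = 0; i < limits.length && remaining > 0; i++) {
    const extra = Math.min(remaining, perOpLimit - limits[i]);
    limits[i] += extra;
    remaining -= extra;
  }
  return { limits, padding: remaining };
}

const targetRatio = 0.1310097807;      // from the example above
const perOpLimit = 1040000;            // per-operation limit at the time
const minGas = [30000, 10000, 40000];  // hypothetical swap / approve / swap minimums

// A 0.1 XTZ fee fits within the three operations.
const smallTotal = gasForTrailing(100000, targetRatio);
const smallPlan = distributeGas(smallTotal, minGas, perOpLimit);

// A 0.5 XTZ fee needs more gas limit than three operations can carry.
const largeTotal = gasForTrailing(500000, targetRatio);
const largePlan = distributeGas(largeTotal, minGas, perOpLimit);

console.log(smallTotal, smallPlan, largeTotal, largePlan);
```

A real implementation would also cap the total at whatever is left of the block gas limit and account for the gas the padding operations themselves need.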

Taking Profit

These bots try to maintain a near-zero balance of asset tokens and instead trade in and out of them during the arbitrage operation. This makes the operation group more complex but also provides more options for pricing the operation fee correctly. It also removes token economic risk where holding, say, Plenty over a long period of time may be undesirable.

Indirect arbitrage through an intermediary token is also supported in the code. In this case the bot will first swap XTZ for Plenty, then Plenty for the target token, then the target token back to Plenty and finally Plenty to XTZ.

Scaling

Software is hard and Tezos is cheap. It might be tempting to architect these bots in such a way that a single application can manage multiple tokens and multiple pools, however running a bot per market pair significantly simplifies the design and makes the overall operation more resilient. It also allows for parallel arb execution, since a single account can only include one operation group per block and if that operation somehow gets stuck in the mempool beyond the expected block it will halt the entire arb process.

Tricks

Unlike Ethereum, Tezos didn't have a replace-by-fee feature in 2021. There was however a workaround where it was possible to submit an operation group from the same account with a higher fee to a different node and have that override an earlier submission. This code never moved past proof of concept and isn't included here. Since then some sort of replace-by-fee feature has been added and the stale operation mechanism has changed.

Code Overview

When reading the code keep in mind that blockchains operate on whole numbers, so there is an adjustment in thinking compared to floating point math.
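As an illustration of the whole-number mindset, here is a minimal sketch of converting between human-readable amounts and base units without going through floating point. The helper names are made up for this example and are not from the repository; the sketch handles non-negative decimal strings only.

```typescript
// Parse a decimal string such as "0.00747332" into whole base units.
function toBaseUnits(amount: string, decimals: number): bigint {
  const [whole, fraction = ""] = amount.split(".");
  if (fraction.length > decimals) {
    throw new Error(`too many decimal places in ${amount}`);
  }
  return BigInt(whole + fraction.padEnd(decimals, "0"));
}

// Format whole base units back into a decimal string.
function fromBaseUnits(value: bigint, decimals: number): string {
  if (decimals === 0) return value.toString();
  const digits = value.toString().padStart(decimals + 1, "0");
  const split = digits.length - decimals;
  return `${digits.slice(0, split)}.${digits.slice(split)}`;
}

const tzbtc = toBaseUnits("0.00747332", 8); // tzBTC uses 8 decimal places
const mutez = toBaseUnits("97.451701", 6);  // XTZ uses 6 decimal places

console.log(tzbtc.toString(), mutez.toString());
console.log(fromBaseUnits(tzbtc, 8), fromBaseUnits(mutez, 6));
```

Keeping amounts as integers end to end avoids rounding surprises such as a minimum output that is off by one base unit.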

While there are several libraries that simplify interaction with the Tezos chain, the code here does it manually. This provides us with more control over the execution flow and allows us to more readily optimize the code if needed.

/src/conductor Orchestration entry point, spawns bots based on available configuration files. Sample configuration files are in /config.

/src/bot/BotPlenty This bot manages swaps for Plenty/token markets. These markets have been deprecated and replaced.

/src/bot/BotSeven Named after TZIP-7, the rather incomplete token specification akin to ERC-20, this code manages swaps for token to XTZ. There is also support for TZIP-12 tokens though the only ones implemented are for Wrap tokens which have been deprecated due to exit of their issuer from the Tezos ecosystem.

/src/market This code manages market operation creation and identification as these will differ between implementations. Plenty market code is for v1 and won't work with their new deployment. Dexter market has been deprecated for some time. Other code may or may not be compatible with current market contracts.

/src/types Types and base classes.

/src/token Token definitions. w*Token.ts is for the dead Wrap tokens with the exception of wxtzToken which was created by a different entity and is fully decentralized.

/src/util/tezosUtil has the functions necessary for the operation of the bots described above. The code is based on ConseilJS, but removes a lot of the hand-holding.

/src/util/arbUtil has the math required to calculate both the arbitrage and the fees necessary to prioritize the operation as needed. This code is naive. If you improve it and want to share, kindly submit a pull request.

Known Issues

Several things have changed on Tezos since this code was last run on mainnet. The block gas limit was reduced from 5,200,000 to 2,600,000 when block time was reduced from 30 to 15 seconds in recent protocol upgrades. This will require changes to the fee estimation code because it breaks the assumption that each of the 5 operations in the arbitrage group can have up to 1.04M gas. In the case of XTZ/Asset swaps the arbitrage group contains these operations:

Swap XTZ for Asset
Approve Asset for market
Swap Asset for XTZ

For intermediary tokens like Plenty it would be:

Swap XTZ for Plenty
Approve Plenty for market
Swap Plenty for Asset
Approve Asset for market
Swap Asset for XTZ
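The two operation groups and the gas budget problem can be expressed in a few lines. This is an illustrative sketch only: the types and function names are hypothetical, and the real operations carry contract parameters that are omitted here.

```typescript
type Step = { kind: "swap" | "approve"; description: string };

// Build the list of operations for a direct or intermediary-token arbitrage.
function arbGroup(asset: string, intermediary?: string): Step[] {
  if (intermediary === undefined) {
    return [
      { kind: "swap", description: `Swap XTZ for ${asset}` },
      { kind: "approve", description: `Approve ${asset} for market` },
      { kind: "swap", description: `Swap ${asset} for XTZ` },
    ];
  }
  return [
    { kind: "swap", description: `Swap XTZ for ${intermediary}` },
    { kind: "approve", description: `Approve ${intermediary} for market` },
    { kind: "swap", description: `Swap ${intermediary} for ${asset}` },
    { kind: "approve", description: `Approve ${asset} for market` },
    { kind: "swap", description: `Swap ${asset} for XTZ` },
  ];
}

// Can a group with the given per-operation gas limit fit in one block?
function fitsInBlock(steps: Step[], gasPerOperation: number, blockGasLimit: number): boolean {
  return steps.length * gasPerOperation <= blockGasLimit;
}

const viaPlenty = arbGroup("USDtz", "Plenty");
const fitsOldLimit = fitsInBlock(viaPlenty, 1040000, 5200000);
const fitsNewLimit = fitsInBlock(viaPlenty, 1040000, 2600000);

console.log(viaPlenty.map((s) => s.description), fitsOldLimit, fitsNewLimit);
```

Five operations at 1.04M gas each exactly filled the old block limit; under the reduced limit the same group no longer fits, which is the assumption the fee estimation code would have to drop.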

The bots in the linked repository don't do a great job of estimating the best arbitrage notional; this is an area for optimization.
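One possible direction, offered as a sketch rather than as what the repository does: for two constant product pools the round-trip profit is a concave function of the input amount, so a simple ternary search over the available balance finds the best notional. The names, the 0.3% fee on both pools and the search bounds are assumptions; the pool balances are the QuipuSwap and Sirius figures from the example in Finding Opportunities.

```typescript
interface Pool { token: number; cash: number }

// Constant product output with a 0.3% input fee (multiplier 997 / 1000).
function swapOut(amountIn: number, reserveIn: number, reserveOut: number): number {
  return (amountIn * reserveOut * 997) / (reserveIn * 1000 + amountIn * 997);
}

// Cash -> token on buyPool, then token -> cash on sellPool.
function roundTripProfit(cashIn: number, buyPool: Pool, sellPool: Pool): number {
  const token = swapOut(cashIn, buyPool.cash, buyPool.token);
  return swapOut(token, sellPool.token, sellPool.cash) - cashIn;
}

// Ternary search for the input that maximizes profit on [0, maxCashIn].
function bestNotional(buyPool: Pool, sellPool: Pool, maxCashIn: number, iterations: number = 100): number {
  let lo = 0;
  let hi = maxCashIn;
  for (let i = 0; i < iterations; i++) {
    const m1 = lo + (hi - lo) / 3;
    const m2 = hi - (hi - lo) / 3;
    if (roundTripProfit(m1, buyPool, sellPool) < roundTripProfit(m2, buyPool, sellPool)) {
      lo = m1;
    } else {
      hi = m2;
    }
  }
  return (lo + hi) / 2;
}

const quipu: Pool = { token: 0.65243867, cash: 8435.621216 };
const sirius: Pool = { token: 82.32168282, cash: 1093481.852659 };

const best = bestNotional(quipu, sirius, 1000); // assume 1,000 XTZ available
const bestProfit = roundTripProfit(best, quipu, sirius);

console.log({ best, bestProfit });
```

The result is before operation fees, so the fee needed to land behind the target trade still has to be subtracted before deciding whether to submit.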

Additional Context

By the end of 2021 dex market conditions weren't making these bots profitable. It was a combination of low liquidity, low volume and increased level of similar, adversarial bot activity. For example, there were trades from other bots that were committed at a loss, presumably to discourage competition.

There is a new token swap UI, 3route, which optimizes trades in a way similar to these bots. Increased popularity of that tool will reduce the profitability of these bots.

There is a recent project similar to Ethereum MEV, Flashbake, which is worth considering if the consortium gets big enough to create blocks often.

There are better ways to make this work, for example by modifying the Tezos software, which is written in OCaml. Reading data from the node as binary instead of JSON may improve performance.

Lately there are many pools that pair an asset to the CTEZ token rather than XTZ directly. This complicates profit calculation. Consider that the pool that influences CTEZ "drift" calculation is only the 20th by amount of CTEZ in it and has 50x less liquidity than the top contract by balance.

There is more in this code, including various arbitrage modes and attempts to identify adversarial bots.