CoLT5: Faster Long-Range Transformers with Conditional Computation

CoLT5 handles inputs of up to 64,000 tokens, making long documents practical to process.

<p>Many text-processing tasks involve very long inputs, but running a transformer (a type of neural network) over a large document is expensive: the cost of attention grows quadratically with input length, and every token passes through the same heavy layers. Starting from the hypothesis that not all parts of the input are equally important, the researchers built a model that concentrates its computation on the tokens that matter most and can process up to 64,000 tokens (about 100 pages). This makes it feasible to process large articles or short books whole.</p>
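
<p>The idea of treating tokens unequally can be illustrated with a toy sketch. This is not the paper's code or the API of the linked repository: the names <code>route_top_k</code> and <code>conditional_layer</code>, the scalar "tokens", and the hand-written scores are all assumptions made for illustration. Every token goes through a cheap "light" function, and only the k highest-scoring tokens also go through an expensive "heavy" one, whose output is scaled by the token's score.</p>

```python
from typing import Callable, List, Sequence


def route_top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring tokens, in input order."""
    k = min(k, len(scores))
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def conditional_layer(
    tokens: Sequence[float],
    scores: Sequence[float],
    k: int,
    light: Callable[[float], float],
    heavy: Callable[[float], float],
) -> List[float]:
    """Apply `light` to every token and add score-weighted `heavy` output
    only for the k routed tokens (hypothetical simplification)."""
    out = [light(x) for x in tokens]
    for i in route_top_k(scores, k):
        out[i] += scores[i] * heavy(tokens[i])
    return out


# Toy data: in a real model the scores would come from a learned router.
tokens = [1.0, 2.0, 3.0, 4.0]
scores = [0.1, 0.9, 0.2, 0.8]
result = conditional_layer(
    tokens,
    scores,
    k=2,
    light=lambda x: 0.5 * x,  # cheap branch, runs on all tokens
    heavy=lambda x: x * x,    # expensive branch, runs on 2 of 4 tokens
)
print(result)
```

<p>In this sketch the heavy function runs on only k tokens regardless of input length, which is where the savings would come from on long inputs. A real layer would operate on vectors and learn the scores during training.</p>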

<p>Paper: <a href="https://arxiv.org/abs/2303.09752">https://arxiv.org/abs/2303.09752</a><br />
GitHub: <a href="https://github.com/lucidrains/CoLT5-attention">https://github.com/lucidrains/CoLT5-attention</a></p>