Ever since listening to Jacob Kimmel on Dwarkesh’s podcast, I have been wondering about the following question: given that you inject an LNP-mRNA vaccine (e.g. the BioNTech-Pfizer vaccine) at a random location, for example your arm, how do you make sure that whatever protein is encoded on your mRNA is only expressed in the right cell type?
This actually matters quite a lot, because your LNP will concentrate in a few off-target locations, and high expression in those cell types may cause unwanted toxicity. From a BioNTech-EU report:
Over 48 hours, distribution was mainly observed to liver, adrenal glands, spleen and ovaries, with maximum concentrations observed at 8-48 hours post-dose.
NewLimit, Jacob Kimmel’s longevity company, cleverly exploits this skewed biodistribution by targeting liver cells for their first program. However, all current and future programs targeting other cell types will have to deal with this issue in the near term. The chemists among you would raise a finger and say: “Oh, but we will make sure the LNP doesn’t go into the wrong cells in the first place!” and while this is a very valid argument, it is just one lever to pull on.
The other lever, and the one I got interested in, is: assuming your LNP does end up in the wrong cell, how do we minimize payload expression while we wait for the mRNA to degrade?
At a recent hackathon, a few smart teams found a bunch of interesting solutions to this problem: how do you design mRNA sequences with cell-type-specific expression?
One of them was a form of latent-space Bayesian optimization with a simple MLP encoder-decoder setup and RiboNN as the translational efficiency (TE) predictor. Apparently this works fairly well. One worry with this kind of setup is that you are implicitly assuming your TE predictor is reliable. That is not true in general, and you might end up generating sequences that live in regions where your predictor is inaccurate. Sequence-to-expression models are known to do a lot of “look-up” prediction: the worst sequence to predict is one with enough sequence identity to a training example to fool the predictor, yet just enough difference to cause totally different expression. That sounds an awful lot like latent-space optimization, where you start from a known sequence and potentially push it to extreme expression configurations where your predictor is confidently wrong. One potential way around this would be to bake into the model that generated sequences have to follow the natural distribution of mRNA sequences.
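To make that failure mode concrete, here is a minimal sketch of what such a latent-space optimization loop might look like. The `encoder`, `decoder`, and `te_predictor` objects are stand-ins for the pre-trained MLP pair and a RiboNN-like model; their names are illustrative, not real APIs. For brevity the sketch uses plain gradient ascent on the predictor rather than a full Bayesian-optimization acquisition loop, and it assumes the predictor can be differentiated through a softmax-relaxed sequence.

```python
import torch

def optimize_latent(seq_onehot, encoder, decoder, te_predictor,
                    steps=200, lr=0.05):
    """Push a known sequence toward higher predicted TE in latent space.

    seq_onehot: (L, 4) one-hot mRNA sequence. All three models are
    hypothetical stand-ins for a pre-trained MLP encoder/decoder and a
    RiboNN-like TE predictor.
    """
    z = encoder(seq_onehot).detach().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        logits = decoder(z)                       # (L, 4) nucleotide logits
        soft_seq = torch.softmax(logits, dim=-1)  # relaxed sequence
        loss = -te_predictor(soft_seq).mean()     # maximize predicted TE
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Nothing in this loop keeps z near the data manifold, which is
    # exactly where the "confidently wrong predictor" problem comes from.
    return decoder(z).argmax(dim=-1)
```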
This is how the idea for tropical was born: an autoregressive transformer model that generates mRNA sequences conditioned on a protein sequence and translation efficiencies. From a user’s perspective, you give it a (short or empty) mRNA sequence prompt, a protein sequence representing your payload, and a dictionary of translation efficiencies for all cell types as input, and you get an optimized mRNA sequence as output.
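To make that interface concrete, a hypothetical call might look like the snippet below; `tropical.generate`, the argument names, and the cell-type keys are all made up for illustration.

```python
import tropical  # hypothetical package

optimized_mrna = tropical.generate(
    prompt="AUG",                       # short (or empty) mRNA prompt
    protein="MFVFLVLLPLVSSQ",           # payload protein sequence
    te={"hepatocyte": 0.9,              # express strongly in liver cells...
        "adrenal": 0.0, "ovary": 0.0},  # ...and ideally not at all here
)
```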

Claude’s first pass at a frontend for tropical.
If we were not to condition on protein sequence and TE, this would just be a straightforward decoder-only transformer. I am going to explain the exact conditioning in more detail, but in short, tropical uses cross-attention to condition on the protein sequence and adaptive LayerNorm to condition on a translational efficiency vector. The beautiful thing about this setup is that the loss and training task stay the same: autoregressive, self-supervised next-nucleotide prediction. The intuition for why this works is that conditioning boils down to giving the model hints that help it predict the next token more accurately. To illustrate this, think about which task would be harder: predicting the next nucleotide from scratch, or predicting it when you already know which protein the sequence has to encode?
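For the impatient, here is a minimal PyTorch sketch of what one such conditioned decoder block could look like. This illustrates the general pattern (DiT-style adaptive LayerNorm plus cross-attention), not tropical’s actual implementation; all module names, dimensions, and shapes are assumptions.

```python
import torch
import torch.nn as nn

class ConditionedBlock(nn.Module):
    """One decoder block, conditioned two ways (illustrative sketch):
    - cross-attention over protein-sequence embeddings
    - adaptive LayerNorm driven by the TE vector
    The training task stays plain next-nucleotide prediction; conditioning
    only changes what the block gets to look at.
    """
    def __init__(self, d_model, n_heads, te_dim):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norm = nn.LayerNorm(d_model, elementwise_affine=False)
        # TE vector -> per-layer scale and shift (the "adaptive" in AdaLN)
        self.ada = nn.Linear(te_dim, 2 * d_model)

    def ada_ln(self, x, te):
        scale, shift = self.ada(te).chunk(2, dim=-1)
        return self.norm(x) * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)

    def forward(self, x, protein_emb, te, causal_mask):
        # x: (B, L, d) mRNA tokens, protein_emb: (B, Lp, d), te: (B, te_dim)
        h = self.ada_ln(x, te)
        x = x + self.self_attn(h, h, h, attn_mask=causal_mask)[0]
        h = self.ada_ln(x, te)
        # queries come from the mRNA stream, keys/values from the protein
        x = x + self.cross_attn(h, protein_emb, protein_emb)[0]
        return x + self.mlp(self.ada_ln(x, te))
```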
If you are already familiar with cross-attention, skip to here.
🚧 TODO