The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on May 10, 2021, is named LT01512PCT_SL.txt and is 195,012 bytes in size.
The current disclosure relates to mutant phage-type RNA polymerases, including mutant T7, SP6, and T3 RNA polymerases that can use 2′-modified nucleoside triphosphates or deoxynucleotide triphosphates as substrates. This disclosure also relates to methods for producing nucleic acid molecules using these mutant polymerases.
RNA stability is a major issue in RNA research and in applications involving RNA (e.g., monitoring of gene expression, in vitro transcription for generation of RNA probes, studies of ribozymes, RNA interference, CRISPR/Cas9-mediated genome editing, selection of aptamers, etc.). Oligonucleotides with altered chemistry have proven to be of great value. Modifications, especially at the 2′ position of the ribose, do not significantly change the conformation of RNA, yet they substantially increase its melting temperature, resistance to ribonucleases, and chemical stability, and result in faster hybridization kinetics.
In response to rapid development of the field of RNA biology, synthesis of natural and modified RNA has become an active field. Chemical synthesis of RNA oligonucleotides can be low-yielding, time-consuming, difficult to scale, and expensive. Low coupling efficiency of RNA monomers significantly limits the length of attainable RNA. As an alternative, bacteriophage T7 RNA polymerase (T7 RNAP) is the enzyme of choice for highly efficient enzymatic synthesis of RNA. However, wild type T7 RNAP is able to efficiently incorporate only natural (or canonical) ribonucleoside triphosphates (rNTPs). Some T7 RNAP mutants exhibit reduced discrimination between canonical and noncanonical triphosphates; however, such discrimination is still substantial. Substitution of multiple rNTPs with dNTPs causes significant reductions in the activity of known T7 RNA polymerases, thus limiting the use of these enzymes in the synthesis of mixed rNMP/dNMP-containing transcripts. It has also been proposed that mutations that confer new activity on an enzyme also destabilize the protein, rendering it less active overall, with low transcriptional yields (see Wang et al., Journal of Molecular Biology 320:85-95 (2002) and Romero et al., Biotechnology and Bioengineering 103(3):472-479 (2009)).
Enzymatic synthesis of modified RNA and/or DNA oligonucleotides and polynucleotides is useful for many applications, especially those where chemically stable, non-immunogenic oligonucleotides are required. Interest in efficient enzymatic synthesis of single-stranded DNA (ssDNA) has been growing due to its broad applicability in nanotechnology, genome editing, drug delivery, data storage, and other areas. Thus, there is a need for enzymes (e.g., polymerases and other enzymes) with expanded ranges of accepted nucleotide substrates and/or overall high activity. Provided herein are polymerases, in particular mutant RNA polymerases, and related methods and compositions that can meet this need and/or provide other benefits.
Disclosed herein are mutant RNA polymerases. These polymerases may be mutant phage-type RNA polymerases, including mutant T7, SP6, or T3 RNA polymerases. These mutant polymerases may have enhanced ability to incorporate one or more modified nucleoside triphosphates, modified rNTPs, dNTPs, and/or ddNTPs compared with the wild type RNA polymerase. Methods for producing nucleic acid molecules using these mutant polymerases are also disclosed.
In some aspects, a mutant polymerase comprises:
- a. a catalytic domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 4, and wherein the mutant polymerase comprises one or more mutations at position V459, G231, V365, F222, M372, T405, P406, Q462, and/or D463 relative to SEQ ID NO: 4; or
- b. a catalytic domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 5, and wherein the mutant polymerase comprises one or more mutations at position V481, G257, T391, H247, I398, L430, P431, H484, and/or D485 relative to SEQ ID NO: 5; or
- c. a catalytic domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 6, and wherein the mutant polymerase comprises one or more mutations at position V459, G231, V365, F222, M372, T405, P406, Q462, and/or D463 relative to SEQ ID NO: 6.
In some aspects, the mutant polymerase further comprises an N-terminal domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with amino acid sequence corresponding to amino acids 1-324 of SEQ ID NO: 1, 1-297 of SEQ ID NO: 2, or 1-325 of SEQ ID NO: 3.
In some aspects, the mutant polymerase comprises:
- (i) a catalytic domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 4,
- (ii) one or more mutations at position V459, G231, V365, F222, M372, T405, P406, Q462, and/or D463 relative to SEQ ID NO: 4, and
- (iii) an N-terminal domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with amino acid sequence corresponding to amino acids 1-324 of SEQ ID NO: 1.
In some aspects, the mutant polymerase comprises:
- (i) a catalytic domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 5,
- (ii) one or more mutations at position V481, G257, T391, H247, I398, L430, P431, H484, and/or D485 relative to SEQ ID NO: 5, and
- (iii) an N-terminal domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with amino acid sequence corresponding to amino acids 1-297 of SEQ ID NO: 2.
In some aspects, the mutant polymerase comprises:
- (i) a catalytic domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 6,
- (ii) one or more mutations at position V459, G231, V365, F222, M372, T405, P406, Q462, and/or D463 relative to SEQ ID NO: 6, and
- (iii) an N-terminal domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with amino acid sequence corresponding to amino acids 1-325 of SEQ ID NO: 3.
In some aspects, the catalytic domain and the N-terminal domain of the mutant polymerase are covalently linked. In some aspects, the catalytic domain and the N-terminal domain of the mutant polymerase are non-covalently linked.
In some aspects, a mutant polymerase comprises:
- a. at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 1 and one or more mutations at position V783, G555, V689, F546, M696, T729, P730, Q786, and/or D787 relative to SEQ ID NO: 1; or
- b. at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 2 and one or more mutations at position V778, G554, T688, H544, I695, L727, P728, H781, and/or D782 relative to SEQ ID NO: 2; or
- c. at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 3 and one or more mutations at position V784, G556, V690, F547, M697, T730, P731, Q787, and/or D788 relative to SEQ ID NO: 3.
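The full-length positions listed above and the catalytic-domain positions recited earlier for SEQ ID NOS: 4 to 6 differ by a constant offset equal to the length of the corresponding N-terminal domain (amino acids 1-324, 1-297, and 1-325 of SEQ ID NOS: 1, 2, and 3, respectively). The following is a minimal sketch of that conversion; the function and dictionary names are illustrative assumptions and are not part of the disclosure.

```python
# Length of the N-terminal domain of each polymerase, taken from the amino
# acid ranges recited for SEQ ID NOS: 1-3 (1-324, 1-297, and 1-325).
N_TERMINAL_LENGTH = {"T7": 324, "SP6": 297, "T3": 325}


def to_full_length(polymerase: str, catalytic_position: int) -> int:
    """Convert a catalytic-domain position to full-length numbering."""
    return catalytic_position + N_TERMINAL_LENGTH[polymerase]


def to_catalytic(polymerase: str, full_length_position: int) -> int:
    """Convert a full-length position to catalytic-domain numbering."""
    position = full_length_position - N_TERMINAL_LENGTH[polymerase]
    if position < 1:
        raise ValueError("position lies within the N-terminal domain")
    return position


if __name__ == "__main__":
    print(to_full_length("T7", 459))   # V459 of SEQ ID NO: 4 -> 783
    print(to_full_length("SP6", 481))  # V481 of SEQ ID NO: 5 -> 778
    print(to_catalytic("T3", 784))     # V784 of SEQ ID NO: 3 -> 459
```

For example, G231 of SEQ ID NO: 4 corresponds to G555 of SEQ ID NO: 1, because 231 + 324 = 555.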
In some aspects, the mutant polymerase comprises:
- a. at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 1 and one or more mutations at position V783, G555, V689, F546, M696, T729, P730, Q786, and/or D787 relative to SEQ ID NO: 1, and wherein the mutant polymerase can bind a T7 promoter; or
- b. at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 2 and one or more mutations at position V778, G554, T688, H544, I695, L727, P728, H781, and/or D782 relative to SEQ ID NO: 2, and wherein the mutant polymerase can bind a SP6 promoter; or
- c. at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 3 and one or more mutations at position V784, G556, V690, F547, M697, T730, P731, Q787, and/or D788 relative to SEQ ID NO: 3, and wherein the mutant polymerase can bind a T3 promoter.
In some aspects, the mutant polymerase comprises:
- a. at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 1 and one or more mutations at position V783, G555, and/or V689 relative to SEQ ID NO: 1; or
- b. at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 2 and one or more mutations at position V778, G554, and/or T688 relative to SEQ ID NO: 2; or
- c. at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 3 and one or more mutations at position V784, G556, and/or V690 relative to SEQ ID NO: 3.
In some aspects, the mutant polymerase comprises:
- a. 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mutations in the catalytic domain,
- b. 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mutations in the fingers subdomain, and/or
- c. 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mutations in the palm subdomains.
In some aspects, the mutant polymerase has at least 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% of the polymerase activity of the corresponding wild type polymerase when unmodified rNTPs are used for synthesizing RNA oligonucleotides from a DNA template.
In some aspects, the mutant polymerase has at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 1.
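The sequence identity thresholds recited above can be illustrated with a simple calculation. The sketch below assumes two sequences that have already been aligned to equal length (with "-" marking gaps) and counts identical residues over all alignment columns. This convention is only one of several in use and is an assumption made here for illustration; it is not a definition taken from this disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention: identical, non-gap columns divided by the total
    number of alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Ten-residue toy example with a single mismatch at the last position.
    print(percent_identity("MNTINIAKND", "MNTINIAKNE"))
    print(meets_threshold("MNTINIAKND", "MNTINIAKNE", 80.0))
```

In practice the alignment itself would be produced by a dedicated alignment tool; only the counting step is shown here.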
In some aspects, the mutant polymerase comprises one or more of the following substitutions relative to SEQ ID NO: 1:
- a. a V783M, V783L, V783I, or V783C substitution,
- b. a G555L, G555M, G555I, G555V, or G555Y substitution,
- c. a V689Q, V689N, V689D, V689E, V689R, V689S, or V689W substitution,
- d. a F546E, F546M, or F546I substitution,
- e. a M696G or M696H substitution,
- f. a T729H, T729L, or T729R substitution,
- g. a P730Y substitution,
- h. a Q786M, Q786L, Q786N, or Q786W substitution, and/or
- i. a D787I substitution.
In some aspects, the mutant polymerase comprises two or more substitutions selected from V783M, G555L, and V689Q. In some aspects, the mutant polymerase comprises V783M, G555L, and V689Q substitutions.
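The substitution notation used above (wild type residue, position, replacement residue; e.g., V783M) can be applied to a sequence programmatically. The following is a minimal sketch; the function name is hypothetical, and the demonstration uses only the first ten residues of SEQ ID NO: 1 together with an illustrative substitution (T3A) that is not one of the substitutions of this disclosure.

```python
import re

# One-letter wild type residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, substitutions: list[str]) -> str:
    """Return a copy of `sequence` with each substitution applied.

    Each substitution is checked against the residue actually present at
    the stated position, so a numbering mismatch raises an error instead of
    silently producing a wrong mutant.
    """
    residues = list(sequence)
    for code in substitutions:
        match = _SUBSTITUTION.match(code)
        if match is None:
            raise ValueError(f"unrecognized substitution: {code!r}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3)
        )
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    # First ten residues of SEQ ID NO: 1; T3A is illustrative only.
    print(apply_substitutions("MNTINIAKND", ["T3A"]))
```

With the full-length sequence of SEQ ID NO: 1 as input, the same call with `["V783M", "G555L", "V689Q"]` would be expected to yield the triple mutant described above, provided the input sequence is complete and numbered from 1.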
In some aspects, provided herein is a method for synthesizing a single-stranded nucleic acid comprising the steps of:
- a. preparing a synthesis reaction mixture comprising:
- i. a mutant RNA polymerase having enhanced ability to incorporate one or more modified nucleoside triphosphates, modified rNTPs, dNTPs, and/or ddNTPs compared with the wild type RNA polymerase,
- ii. at least one nucleic acid template, and
- iii. a mixture of nucleoside triphosphates; and
- b. performing a synthesis reaction under conditions that result in the production of one or more single-stranded nucleic acids.
In some aspects, the mutant RNA polymerase used in said method comprises an amino acid sequence having at least 80%, 85%, 90%, 95%, 98%, or 99% sequence identity with a sequence selected from SEQ ID NOs: 1 to 6, and 25 to 43, wherein said mutant RNA polymerase comprises at least one substitution or substitution set in said amino acid sequence. In further aspects of the method, the mutant RNA polymerase comprises at least one amino acid substitution at a position corresponding to the position V783, G555, and/or V689 of SEQ ID NO: 1. In further aspects, the mutant RNA polymerase is the mutant polymerase as disclosed in the previous paragraphs. In some aspects, the at least one nucleic acid template used in a method for synthesizing single-stranded nucleic acid comprises one or more promoter sequences recognized by the mutant polymerase. In some aspects, the at least one nucleic acid template comprises a T7, SP6, or T3 promoter operably linked to a target nucleotide sequence.
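As an illustration of a template comprising a promoter recognized by the polymerase, the sketch below searches a template sequence for the T7, SP6, and T3 promoter sequences listed in Table 1 (SEQ ID NOS: 7 to 9). The function name is hypothetical, and the sketch makes simplifying assumptions: only the strand given is searched, and only exact matches are reported.

```python
# Promoter sequences as listed in Table 1 (SEQ ID NOS: 7-9).
PROMOTERS = {
    "T7": "TAATACGACTCACTATAG",
    "SP6": "ATTTAGGTGACACTATAG",
    "T3": "AATTAACCCTCACTAAAG",
}


def find_promoter(template: str):
    """Locate the first exact promoter match in `template`.

    Returns (promoter_name, end_index), where end_index is the 0-based index
    of the first nucleotide after the promoter, or None if no promoter is
    found. Only the given strand is searched.
    """
    template = template.upper()
    hits = []
    for name, promoter in PROMOTERS.items():
        start = template.find(promoter)
        if start != -1:
            hits.append((start, name, start + len(promoter)))
    if not hits:
        return None
    _, name, end = min(hits)  # earliest match wins
    return name, end


if __name__ == "__main__":
    # Hypothetical template: two filler bases, the T7 promoter, then a target.
    template = "GC" + PROMOTERS["T7"] + "GGAAAGCTTGC"
    print(find_promoter(template))
```

A fuller implementation would also check the reverse complement and tolerate promoter variants; those steps are omitted here.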
In some aspects, in a method of synthesizing single-stranded nucleic acid, the mixture of nucleoside triphosphates comprises one or more nucleoside triphosphates selected from modified nucleoside triphosphates, modified rNTPs, dNTPs, and ddNTPs. In some aspects, the mixture of nucleoside triphosphates comprises one or more dNTPs. In some aspects, the mixture of nucleoside triphosphates consists essentially of one or more dNTPs. In some aspects, one or more dNTPs in the mixture of nucleoside triphosphates are modified; in further aspects, one or more dNTPs are 2′-F modified.
In some aspects, in a method of synthesizing single-stranded nucleic acid, the mixture of nucleoside triphosphates comprises one or more 2′-modified rNTPs. In some aspects, the one or more 2′-modified rNTPs are selected from 2′-O-methyl, 2′-NH2, 2′-F, and 2′-methoxy ethyl rNTPs. In some aspects, the mixture of nucleoside triphosphates consists essentially of one or more 2′-modified rNTPs.
In some aspects, a mixture of nucleoside triphosphates comprises:
- a. one or more dNTPs and one or more rNTPs,
- b. three different dNTPs and one rNTP,
- c. two different dNTPs and two different rNTPs, or
- d. one dNTP and three different rNTPs.
In some aspects, one or more rNTPs are modified. In some aspects, one or more rNTPs are 2′-modified.
In some aspects, a mixture of nucleoside triphosphates comprises:
- a. dTTP, dCTP, ATP, and GTP;
- b. dTTP, CTP, ATP, and dGTP;
- c. dTTP, dCTP, dATP, and GTP;
- d. dTTP, dCTP, dATP, and 2′-F-dGTP;
- e. dUTP, dCTP, ATP, and GTP; or
- f. dUTP, dCTP, dATP, and GTP.
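The nucleoside triphosphate mixtures a to f above can be summarized by the number of dNTPs and rNTPs each contains. The following minimal sketch tabulates that composition; the names `MIXES`, `classify`, and `composition` are illustrative assumptions, and a prefix such as "2′-F-" is treated simply as marking a modified nucleotide.

```python
DNTPS = {"dATP", "dCTP", "dGTP", "dTTP", "dUTP"}
RNTPS = {"ATP", "CTP", "GTP", "UTP"}

# Mixtures a-f as listed above.
MIXES = {
    "a": ["dTTP", "dCTP", "ATP", "GTP"],
    "b": ["dTTP", "CTP", "ATP", "dGTP"],
    "c": ["dTTP", "dCTP", "dATP", "GTP"],
    "d": ["dTTP", "dCTP", "dATP", "2′-F-dGTP"],
    "e": ["dUTP", "dCTP", "ATP", "GTP"],
    "f": ["dUTP", "dCTP", "dATP", "GTP"],
}


def classify(nucleotide: str):
    """Return (kind, is_modified), where kind is 'dNTP' or 'rNTP'."""
    core = nucleotide.split("-")[-1]  # drop a prefix such as "2′-F-"
    is_modified = core != nucleotide
    if core in DNTPS:
        return "dNTP", is_modified
    if core in RNTPS:
        return "rNTP", is_modified
    raise ValueError(f"unrecognized nucleotide: {nucleotide!r}")


def composition(mix):
    """Return (number of dNTPs, number of rNTPs) in a mixture."""
    kinds = [classify(nucleotide)[0] for nucleotide in mix]
    return kinds.count("dNTP"), kinds.count("rNTP")


if __name__ == "__main__":
    for label, mix in MIXES.items():
        print(label, composition(mix))
```

Under this tabulation, mixtures a, b, and e contain two dNTPs and two rNTPs, mixtures c and f contain three dNTPs and one rNTP, and mixture d contains four dNTPs, one of which is 2′-F modified.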
In some aspects, a mixture of nucleoside triphosphates further comprises one or more ddNTPs. In some aspects, one or more of the ddNTPs are modified.
In some aspects, the synthesis reaction mixture further comprises a cap or cap analog.
In some aspects, the mixture of nucleoside triphosphates comprises one or more oligonucleotide-tethered nucleotides of formula (A):
Oligo is an oligonucleotide of 3 to 100 nucleotides;
- each of X and Q is independently chosen from H, OH, N3, halo, alkyl, alkoxy, alkenyl, alkynyl, acyl, cyano, amino, ester, and amido;
- each of Z and Y is independently chosen from a bond, amino, amido, alkyl, alkenyl, alkynyl, thioether, sulfonyl, sulfonamido, ether, ketone, carbonyl, anhydride, ester, imido, urea, urethane, and combinations thereof; and
- CXN is chosen from alkylene, alkenylene, alkynylene, ketone, carbonate, ester, ether, anhydride, amido, amino, aminoalkylene, imino, imido, diazo, carbamate ester, phosphodiester, sulfide, disulfide, sulfonyl, sulfonamido, and a heterocyclic group containing from one to four N, O, S atom(s) or a combination thereof where heterocyclic group is optionally substituted at carbon, nitrogen or sulfur atom(s).
In some aspects, CXN is Click, wherein Click is a product of a click reaction between one of the following pairs of functional groups: i) alkynyl and azido; ii) thiol and alkynyl; iii) thiol and alkenyl; iv) azido and cyclooctanyl; and v) cyclooctanyl and nitrone.
In some aspects, the mixture of nucleoside triphosphates comprises one or more oligonucleotide-tethered nucleotides of formula (I):
or a salt thereof,
- wherein X is H, N3, or OH;
- NB represents a nucleobase chosen from adenine, 7-deaza-adenine, cytosine, guanine, 7-deazaguanine, thymine, uracil and inosine;
- Z and Y are linkers, wherein Z and Y each independently comprise at least one linking moiety chosen from amino, amido, alkyl, alkenyl, alkynyl, thioether, sulfonyl, sulfonamido, ether, ketone, carbonyl, anhydride, ester, imide, urea, urethane, or any combination thereof;
- Click is the product of a click reaction; and
- Oligo is an oligonucleotide of 3 to 100 nucleotides in length.
In some aspects, X of formula (A) or formula (I) is OH. In other aspects, X of formula (A) or formula (I) is H.
In some aspects, the one or more nucleic acids synthesized by the method of synthesizing single-stranded nucleic acid comprise deoxyribonucleotides and/or ribonucleotides. In some aspects, the one or more synthesized nucleic acids comprise deoxyribonucleotides. In other aspects, the one or more nucleic acids comprise deoxyribonucleotides and ribonucleotides.
In some aspects, a method of synthesizing single-stranded nucleic acid further comprises an amplification reaction, wherein one or more of the synthesized nucleic acids serves as a primer. In some aspects, the primer comprises from about 8 to about 200 nucleotides.
In some aspects, the one or more nucleic acids synthesized by the method of synthesizing single-stranded nucleic acid comprise ribonucleotides. In some aspects, the one or more nucleic acids comprise modified ribonucleotides. In some aspects, the one or more nucleic acids comprising ribonucleotides are an RNA aptamer, a ribozyme, an siRNA, an miRNA, or an antisense RNA.
In some aspects, the one or more nucleic acids synthesized by the method for synthesizing single-stranded nucleic acid comprise from about 8 to about 2000 nucleotides.
In some aspects, the synthesis reaction does not require primers.
In some aspects, the amplification reaction does not require addition of primers.
In some aspects, the synthesis reaction is performed without changes in reaction temperature.
In some aspects, the method for synthesizing single-stranded nucleic acid is used for production of barcoded nucleic acid oligonucleotides, enzymatic primer synthesis, unbiased amplification of specific targets, whole genome amplification, or tagging via in vitro transcription.
In some aspects, nucleic acids obtained by the method of synthesizing single-stranded nucleic acid are used for amplicon sequencing or preparation of a sequencing library.
FIG. 1 provides a scheme of in vitro evolution of T7 RNA polymerase towards altered substrate specificity.
FIG. 2 shows mutant polymerases that have the highest enrichment (increase in relative mutation frequency) after selection. Mutant polymerases with enrichment not lower than 7X (i.e., not less than 10X lower enrichment compared to the enrichment of the Y639F mutant) were analyzed.
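The enrichment filter described for FIG. 2 can be sketched as follows. The sketch assumes that enrichment is the ratio of a mutation's relative frequency after selection to its relative frequency before selection; this formula, the function names, and the read counts are assumptions made for illustration and are not data from this disclosure.

```python
def enrichment(count_before, total_before, count_after, total_after):
    """Fold change in relative mutation frequency (after / before selection).

    Assumed definition:
        (count_after / total_after) / (count_before / total_before)
    computed as a single division.
    """
    if count_before <= 0 or total_before <= 0 or total_after <= 0:
        raise ValueError("counts before selection and totals must be positive")
    return (count_after * total_before) / (count_before * total_after)


def passes_threshold(value, threshold=7.0):
    """True if the enrichment is not lower than the threshold (7X here)."""
    return value >= threshold


if __name__ == "__main__":
    # Hypothetical read counts: (before, total before, after, total after).
    example_counts = {
        "mutant_A": (10, 1000, 70, 1000),
        "mutant_B": (10, 1000, 30, 1000),
    }
    selected = [
        name
        for name, counts in example_counts.items()
        if passes_threshold(enrichment(*counts))
    ]
    print(selected)
```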
FIGS. 3A-3E show the activity of V783M T7 RNA polymerase mutant using different nucleotide substrates for in vitro transcription reaction. A)-E) represent specific combinations of ribo- and/or deoxyribonucleotides, as described in Example 5. Different dash types of the arrows indicate reaction products by a specific T7 RNA polymerase used: wt - wild type T7 RNA polymerase without a His tag (Thermo Scientific, EP0113); wt-His - purified wild type T7 RNA polymerase with a His tag; and V783M-His - purified T7 RNA polymerase having a V783M mutation. A peak at around 43 seconds (s) denotes a full length in vitro transcription product (42 nt); an arrow with a round tip marks a marker peak (4 nt). The peaks that are between the full product peak and the marker peak denote truncated in vitro transcription products. If arrows are horizontal and have no end, the target full length product was not detected.
FIGS. 4A-4E show the activity of V783L T7 RNA polymerase mutants using different nucleotide substrates for in vitro transcription reaction. A)-E) represent specific combinations of ribo- and/or deoxyribonucleotides, as described in Example 5. Different dash types of the arrows indicate reaction products by different polymerases: wt - wild type T7 RNA polymerase without a His tag (Thermo Scientific, EP0113); wt-His - purified wild type T7 RNA polymerase with a His tag; and V783L-His - purified T7 RNA polymerase having a V783L mutation. Peaks and labels are described above for FIGS. 3A-3E.
FIGS. 5A-5E show the activity of V689Q T7 RNA polymerase mutants using different nucleotide substrates for in vitro transcription reaction. A)-E) represent specific combinations of ribo- and/or deoxyribonucleotides, as described in Example 5. Different dash types of the arrows indicate reaction products by different polymerases: wt - wild type T7 RNA polymerase without a His tag (Thermo Scientific, EP0113); wt-His - purified wild type T7 RNA polymerase with a His tag; and V689Q - purified T7 RNA polymerase having a V689Q mutation. Peaks and labels are described above for FIGS. 3A-3E.
FIGS. 6A-6E show the activity of G555L T7 mutant RNA polymerase using different nucleotide substrates for in vitro transcription reaction. A)-E) represent specific combinations of ribo- and/or deoxyribonucleotides, as described in Example 5. Different dash types of the arrows indicate reaction products by different polymerases: wt - wild type T7 RNA polymerase without a His tag (Thermo Scientific, EP0113); wt-His - purified wild type T7 RNA polymerase with a His tag; and G555L-His - purified T7 RNA polymerase having a G555L mutation. Peaks and labels are described above for FIGS. 3A-3E.
FIGS. 7A-7E show results with de novo synthesis of oligonucleotides using a variety of substrate mixes, with these mixes labeled on different lanes of the figures:
- a) UTP, CTP, ATP, and GTP;
- b) dTTP, dCTP, ATP, and GTP;
- c) dTTP, dCTP, dATP, and GTP;
- d) dTTP, dCTP, dATP, and 2′-F-dGTP;
- e) dUTP, dCTP, ATP, and GTP; and
- f) dUTP, dCTP, dATP, and GTP.
Data presented are with the following PCR mixes: (A) DreamTaq PCR Master Mix (2X); (B) Platinum II Hot-Start PCR Master Mix; (C) Platinum SuperFi PCR Master Mix; (D) Phusion Hot Start II High-Fidelity PCR Master Mix; and (E) Phusion U Multiplex PCR Master Mix. NC - a negative control reaction with no primer having SEQ ID NO: 21 added; ssDNA primer - a chemically synthesized primer having SEQ ID NO: 21 used as a positive control reaction. First and last lanes - GeneRuler DNA Ladder Mix (Thermo Scientific, SM0331).
FIG. 8 shows an exemplary scheme of using oligonucleotides synthesized by T7 RNA mutant polymerase in a reverse transcription reaction. Oligo(dT) anchored with, e.g. PCR handle 2, are synthesized during in vitro transcription. Anchored oligo(dT) are used by reverse transcriptase as primers to synthesize first strand cDNA. Optionally, first strand cDNA synthesis is randomly terminated by incorporation of oligo-tethered ddNTP (comprising, e.g. PCR handle 1) by reverse transcriptase. Reverse transcription reaction products can then be used in indexing PCR. FIG. 8 discloses SEQ ID NOS 69-71, respectively in order of appearance.
FIGS. 9A-9C show nucleic acid amplification and random termination/tagging via in vitro transcription. (A) Principal protocol of nucleic acid amplification and random termination/tagging via in vitro transcription using T7 mutant RNA polymerases; during in vitro transcription from T7 promoter, a nucleic acid with PCR handle 2 at 5′-end is synthesized, the synthesis is randomly terminated when T7 RNA polymerase incorporates oligo-tethered ddNTP; oligonucleotide in oligo-tethered ddNTP comprises, e.g. PCR handle 1. Thus, single stranded nucleic acid fragments are generated that have tag sequences at both ends; tag sequences are used for introduction of, e.g. full length sequencing adapters during further PCR step. (B) Read 1 structure of libraries generated by nucleic acid amplification and random termination/tagging via in vitro transcription; (C) Average sequencing depth (read counts) across reference. Dashed line indicates a position of T7 promoter in the reference sequence. OTDDN - oligonucleotide-tethered ddNTPs; ssDNA - single-stranded DNA.
FIG. 10 shows a scheme of multiplexed enzymatic primer synthesis using a T7 mutant polymerase for amplicon sequencing. The sequences of amplification primer pairs for target sequences (e.g., target A, B, etc.) to be amplified are created; the amplification primers may comprise an anchor sequence (e.g., a PCR handle; the anchor sequence may be used for a subsequent amplification step). pDNA primer denotes the part of the construction oligonucleotide that hybridizes with a sequence of plasmid DNA (pDNA) for introduction of the oligonucleotide into the plasmid DNA. Then, a pool of in vitro transcription templates (e.g., plasmid DNA molecules that comprise the amplification primer sequence downstream of a T7 promoter) is constructed, e.g., by PCR. Next, a mutant RNAP (e.g., a T7 mutant RNAP of the current disclosure) is used to produce a pool of single-stranded primers during the multiplexed primer synthesis from the pool of in vitro transcription templates. The enzymatically synthesized primer pool can then be used in multiplex PCR, optionally followed by barcoding and sequencing in further steps.
FIG. 11 depicts an alignment of exemplary phage-type RNA polymerases. Consensus levels: high (uppercase letters) = 90%, low (lowercase letters) = 50%. Consensus symbols (all in high consensus category) are as follows: ! is any one of IV, $ is any one of LM, % is any one of FY, # is any one of NDQEBZ. Amino acid positions of exemplary phage-type RNA polymerases corresponding to G555, V689, and V783 of T7 RNA polymerase (SEQ ID NO: 1) are in bold and highlighted in gray.
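The consensus symbols defined for FIG. 11 can be read as small residue sets. The following minimal sketch tests whether a residue is consistent with a given consensus character; the function name is hypothetical, and any alignment characters not defined in the legend above are outside the scope of the sketch.

```python
# Consensus symbols as defined in the legend of FIG. 11.
CONSENSUS_SYMBOLS = {
    "!": "IV",
    "$": "LM",
    "%": "FY",
    "#": "NDQEBZ",
}


def matches_consensus(symbol: str, residue: str) -> bool:
    """True if `residue` is consistent with the consensus character.

    Symbols map to residue sets; a letter (uppercase for high consensus,
    lowercase for low consensus) must equal the residue itself.
    """
    residue = residue.upper()
    if symbol in CONSENSUS_SYMBOLS:
        return residue in CONSENSUS_SYMBOLS[symbol]
    return symbol.upper() == residue


if __name__ == "__main__":
    print(matches_consensus("!", "V"))  # V is one of IV
    print(matches_consensus("%", "W"))  # W is not one of FY
    print(matches_consensus("g", "G"))  # low-consensus letter
```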
Table 1 provides a listing of certain sequences referenced herein.
Description of the Sequences | |||||
Description | Sequences | SEQ ID NO | |||
Wild type T7 RNA Polymerase (GenBank Acc. No. ACY75835.1) Bold font represents positions identified with mutations during selection. | MNTINIAKND | FSDIELAAIP | FNTLADHYGE | RLAREQLALE | 1 |
HESYEMGEAR | FRKMFERQLK | AGEVADNAAA | KPLITTLLPK | ||
MIARINDWFE | EVKAKRGKRP | TAFQFLQEIK | PEAVAYITIK | ||
TTLACLTSAD | NTTVQAVASA | IGRAIEDEAR | FGRIRDLEAK | ||
HFKKNVEEQL | NKRVGHVYKK | AFMQVVEADM | LSKGLLGGEA | ||
WSSWHKEDSI | HVGVRCIEML | IESTGMVSLH | RQNAGVVGQD | ||
SETIELAPEY | AEAIATRAGA | LAGISPMFQP | CVVPPKPWTG | ||
ITGGGYWANG | RRPLALVRTH | SKKALMRYED | VYMPEVYKAI | ||
NIAQNTAWKI | NKKVLAVANV | ITKWKHCPVE | DIPAIEREEL | ||
PMKPEDIDMN | PEALTAWKRA | AAAVYRKDKA | RKSRRISLEF | ||
MLEQANKFAN | HKAIWFPYNM | DWRGRVYAVS | MFNPQGNDMT | ||
KGLLTLAKGK | PIGKEGYYWL | KIHGANCAGV | DKVPFPERIK | ||
FIEENHENIM | ACAKSPLENT | WWAEQDSPFC | FLAFCFEYAG | ||
VQHHGLSYNC | SLPLAFDGSC | SGIQHFSAML | RDEVGGRAVN | ||
LLPSETVQDI | YGIVAKKVNE | ILQADAINGT | DNEVVTVTDE | ||
NTGEISEKVK | LGTKALAGQW | LAYGVTRSVT | KRSVMTLAYG | ||
SKEFGFRQQV | LEDTIQPAID | SGKGLMFTQP | NQAAGYMAKL | ||
IWESVSVTVV | AAVEAMNWLK | SAAKLLAAEV | KDKKTGEILR | ||
KRCAVHWVTP | DGFPVWQEYK | KPIQTRLNLM | FLGQFRLQPT | ||
INTNKDSEID | AHKQESGIAP | NFVHSQDGSH | LRKTVVWAHE | ||
KYGIESFALI | HDSFGTIPAD | AANLFKAVRE | TMVDTYESCD | ||
VLADFYDQFA | DQLHESQLDK | MPALPAKGNL | NLRDILESDF | ||
AFA |
Wild type SP6 RNA polymerase (GenBank Acc. No. AAR90000.1). Bold font represents positions aligned with positions identified with mutations during selection | MQDLHAIQLQ | LEEEMFNGGI | RRFEADQQRQ | IAAGSESDTA | 2 |
WNRRLLSELI | APMAEGIQAY | KEEYEGKKGR | APRALAFLQC | ||
VENEVAAYIT | MKVVMDMLNT | DATLQAIAMS | VAERIEDQVR | ||
FSKLEGHAAK | YFEKVKKSLK | ASRTKSYRHA | HNVAVVAEKS | ||
VAEKDADFDR | WEAWPKETQL | QIGTTLLEIL | EGSVFYNGEP | ||
VFMRAMRTYG | GKTIYYLQTS | ESVGQWISAF | KEHVAQLSPA | ||
LTQKQMPKVY | KAINALQNTQ | WQINKDVLAV | IEEVIRLDLG | ||
YGVPSFKPLI | DKENKPANPV | PVEFQHLRGR | ELKEMLSPEQ | ||
WQQFINWKGE | CARLYTAETK | RGSKSAAVVR | MVGQARKYSA | ||
FESIYFVYAM | DSRSRVYVQS | STLSPQSNDL | GKALLRFTEG | ||
RPVNGVEALK | WFCINGANLW | GWDKKTFDVR | VSNVLDEEFQ | ||
DMCRDIAADP | LTFTQWAKAD | APYEFLAWCF | EYAQYLDLVD | ||
EGRADEFRTH | LPVHQDGSCS | GIQHYSAMLR | DEVGAKAVNL | ||
KPSDAPQDIY | GAVAQVVIKK | NALYMDADDA | TTFTSGSVTL | ||
SGTELRAMAS | AWDSIGITRS | LTKKPVMTLP | YGSTRLTCRE | ||
SVIDYIVDLE | EKEAQKAVAE | GRTANKVHPF | EDDRQDYLTP | ||
GAAYNYMTAL | IWPSISEVVK | APIVAMKMIR | QLARFAAKRN | ||
EGLMYTLPTG | FILEQKIMAT | EMLRVRTCLM | GDIKMSLQVE | ||
TDIVDEAAMM | GAAAPNFVHG | HDASHLILTV | CELVDKGVTS | ||
IAVIHDSFGT | HADNTLTLRV | ALKGQMVAMY | IDGNALQKLL | ||
EEHEERWMVD | TGIEVPEQGE | FDLNEIMDSE | YVFA | ||
Wild type T3 RNA polymerase (GenBank Acc. No. CAC86264.1). Bold font represents positions aligned with positions identified with mutations during selection | MNIIENIEKN | DFSEIELAAI | PFNTLADHYG | SALAKEQLAL | 3 |
EHESYELGER | RFLKMLERQA | KAGEIADNAA | AKPLLATLLP | ||
KLTTRIVEWL | EEYASKKGRK | PSAYAPLQLL | KPEASAFITL | ||
KVILASLTST | NMTTIQAAAG | MLGKAIEDEA | RFGRIRDLEA | ||
KHFKKHVEEQ | LNKRHGQVYK | KAFMQVVEAD | MIGRGLLGGE | ||
AWSSWDKETT | MHVGIRLIEM | LIESTGLVEL | QRHNAGNAGS | ||
DHEALQLAQE | YVDVLAKRAG | ALAGISPMFQ | PCVVPPKPWV | ||
AITGGGYWAN | GRRPLALVRT | HSKKGLMRYE | DVYMPEVYKA | ||
VNLAQNTAWK | INKKVLAVVN | EIVNWKNCPV | ADIPSLERQE | ||
LPPKPDDIDT | NEAALKEWKK | AAAGIYRLDK | ARVSRRISLE | ||
FMLEQANKFA | SKKAIWFPYN | MDWRGRVYAV | PMFNPQGNDM | ||
TKGLLTLAKG | KPIGEEGFYW | LKIHGANCAG | VDKVPFPERI | ||
AFIEKHVDDI | LACAKDPINN | TWWAEQDSPF | CFLAFCFEYA | ||
GVTHHGLSYN | CSLPLAFDGS | CSGIQHFSAM | LRDEVGGRAV | ||
NLLPSETVQD | IYGIVAQKVN | EILKQDAING | TPNEMITVTD | ||
KDTGEISEKL | KLGTSTLAQQ | WLAYGVTRSV | TKRSVMTLAY | ||
GSKEFGFRQQ | VLDDTIQPAI | DSGKGLMFTQ | PNQAAGYMAK | ||
LIWDAVSVTV | VAAVEAMNWL | KSAAKLLAAE | VKDKKTKEIL | ||
RHRCAVHWTT | PDGFPVWQEY | RKPLQKRLDM | IFLGQFRLQP | ||
TINTLKDSGI | DAHKQESGIA | PNFVHSQDGS | HLRMTVVYAH | ||
EKYGIESFAL | IHDSFGTIPA | DAGKLFKAVR | ETMVITYENN | ||
DVLADFYSQF | ADQLHETQLD | KMPPLPKKGN | LNLQDILKSD | ||
FAFA | |||||
Catalytic domain of wild type T7 RNA polymerase; Bold font represents positions identified with mutations during selection. | NTAWKI | NKKVLAVANV | ITKWKHCPVE | DIPAIEREEL | 4 |
PMKPEDIDMN | PEALTAWKRA | AAAVYRKDKA | RKSRRISLEF | ||
MLEQANKFAN | HKAIWFPYNM | DWRGRVYAVS | MFNPQGNDMT | ||
KGLLTLAKGK | PIGKEGYYWL | KIHGANCAGV | DKVPFPERIK | ||
FIEENHENIM | ACAKSPLENT | WWAEQDSPFC | FLAFCFEYAG | ||
VQHHGLSYNC | SLPLAFDGSC | SGIQHFSAML | RDEVGGRAVN | ||
LLPSETVQDI | YGIVAKKVNE | ILQADAINGT | DNEVVTVTDE | ||
NTGEISEKVK | LGTKALAGQW | LAYGVTRSVT | KRSVMTLAYG | ||
SKEFGFRQQV | LEDTIQPAID | SGKGLMFTQP | NQAAGYMAKL | ||
IWESVSVTVV | AAVEAMNWLK | SAAKLLAAEV | KDKKTGEILR | ||
KRCAVHWVTP | DGFPVWQEYK | KPIQTRLNLM | FLGQFRLQPT | ||
INTNKDSEID | AHKQESGIAP | NFVHSQDGSH | LRKTVVWAHE | ||
KYGIESFALI | HDSFGTIPAD | AANLFKAVRE | TMVDTYESCD | ||
VLADFYDQFA | DQLHESQLDK | MPALPAKGNL | NLRDILESDF | ||
AFA |
Catalytic domain of wild type SP6 RNA polymerase; Bold font represents positions aligned with positions identified with mutations during selection | NTQ | WQINKDVLAV | IEEVIRLDLG | YGVPSFKPLI | 5 |
DKENKPANPV | PVEFQHLRGR | ELKEMLSPEQ | WQQFINWKGE | ||
CARLYTAETK | RGSKSAAVVR | MVGQARKYSA | FESIYFVYAM | ||
DSRSRVYVQS | STLSPQSNDL | GKALLRFTEG | RPVNGVEALK | ||
WFCINGANLW | GWDKKTFDVR | VSNVLDEEFQ | DMCRDIAADP | ||
LTFTQWAKAD | APYEFLAWCF | EYAQYLDLVD | EGRADEFRTH | ||
LPVHQDGSCS | GIQHYSAMLR | DEVGAKAVNL | KPSDAPQDIY | ||
GAVAQVVIKK | NALYMDADDA | TTFTSGSVTL | SGTELRAMAS | ||
AWDSIGITRS | LTKKPVMTLP | YGSTRLTCRE | SVIDYIVDLE | ||
EKEAQKAVAE | GRTANKVHPF | EDDRQDYLTP | GAAYNYMTAL | ||
IWPSISEVVK | APIVAMKMIR | QLARFAAKRN | EGLMYTLPTG | ||
FILEQKIMAT | EMLRVRTCLM | GDIKMSLQVE | TDIVDEAAMM | ||
GAAAPNFVHG | HDASHLILTV | CELVDKGVTS | IAVIHDSFGT | ||
HADNTLTLRV | ALKGQMVAMY | IDGNALQKLL | EEHEERWMVD | ||
TGIEVPEQGE | FDLNEIMDSE | YVFA | |||
Catalytic domain of wild type T3 RNA polymerase; Bold font represents positions aligned with positions identified with mutations during selection | NTAWK | INKKVLAVVN | EIVNWKNCPV | ADIPSLERQE | 6 |
LPPKPDDIDT | NEAALKEWKK | AAAGIYRLDK | ARVSRRISLE | ||
FMLEQANKFA | SKKAIWFPYN | MDWRGRVYAV | PMFNPQGNDM | ||
TKGLLTLAKG | KPIGEEGFYW | LKIHGANCAG | VDKVPFPERI | ||
AFIEKHVDDI | LACAKDPINN | TWWAEQDSPF | CFLAFCFEYA | ||
GVTHHGLSYN | CSLPLAFDGS | CSGIQHFSAM | LRDEVGGRAV | ||
NLLPSETVQD | IYGIVAQKVN | EILKQDAING | TPNEMITVTD | ||
KDTGEISEKL | KLGTSTLAQQ | WLAYGVTRSV | TKRSVMTLAY | ||
GSKEFGFRQQ | VLDDTIQPAI | DSGKGLMFTQ | PNQAAGYMAK | ||
LIWDAVSVTV | VAAVEAMNWL | KSAAKLLAAE | VKDKKTKEIL | ||
RHRCAVHWTT | PDGFPVWQEY | RKPLQKRLDM | IFLGQFRLQP | ||
TINTLKDSGI | DAHKQESGIA | PNFVHSQDGS | HLRMTVVYAH | ||
EKYGIESFAL | IHDSFGTIPA | DAGKLFKAVR | ETMVITYENN | ||
DVLADFYSQF | ADQLHETQLD | KMPPLPKKGN | LNLQDILKSD | ||
FAFA | |||||
T7 promoter sequence | TAATACGACTCACTATAG | 7 | |||
SP6 promoter sequence | ATTTAGGTGACACTATAG | 8 | |||
T3 promoter sequence | AATTAACCCTCACTAAAG | 9 | |||
Primer for V783M mutation | ACTTTATGCACAGCCAAGAC | 10 | |||
Primer for V783M mutation | TAGGAGCGATACCAGACTCC | 11 | |||
Primer for V689Q mutation | GACGCAGGTAGCTGCGGTTG | 12 | |||
Primer for V689Q mutation | ACGCTCACAGATTCCCAAATCAG | 13 | |||
Primer for V783L mutation | ACTTTCTTCACAGCCAAGAC | 14 |
Primer for V783L mutation | TAGGAGCGATACCAGACTCC | 15 | |||
Primer for G555L mutation | AGGTACTGGGTCGCGCGGTTAAC | 16 | |||
Primer for G555L mutation | CATCTCGGAGCATCGCGGAGAAG | 17 | |||
Primers that have repetitive histidine codons | CATCACCATCACCATCAC | 18 | |||
Mutagenic primer | CATCACCATCACCATCACATGAACACGATTAACATCGCT AAG | 19 | |||
Mutagenic primer | CTCGAGCTCGGATCCCCATC | 20 | |||
42 nt oligonucleotide | GGGAAAGCTTGCATGCCTGCAGGTCGACTCTAGAGGATCCCC | 21 | |||
Control reverse primer | CAATTTCCCATTCGCCATTCAG | 22 | |||
Handle No 1 | TACACGACGCTCTTCCGATCT | 23 | |||
Handle No 2 | CAGACGTGTGCTCTTCCGATCT | 24 | |||
Protein from Enterobacteria phage 13a (GenBank Acc. No. ACF15888.1) | MNTINIAKNDFSDIELAAIPFNTLADHYGERLAREQLALEHES YEMGEARFRKMFERQLKAGEVADNAAAKPLITTLLPKMIARIN DWFEEVKAKRGKRPTAFQFLQEIKPEAVAYITIKTTLACLTSV DNTTVQAVASAIGRAIEDEARFGRIRDLEAKHFKKNVEEQLNK RVGHVYKKAFMQVVEADMLSKGLLGGEAWSSWHKEDSIHVGVRCIEMLIESTGMVSLHRQNAGVVGQDSETIELAPEYAEAIATRAGALAGISPMFQPCVVPPKPWTGITGGGYWANGRRPLALVRTHSKKALMRYEDVYMPEVYKAINIAQNTAWKINKKVLAVANVITKWKHCPVEDIPAIEREELPMKPEDIDTNPDALTAWKRAAAAVYRKDKARKSRRISLEFMLEQANKFANHKAIWFPYNMDWRGRVYAVSMFNPQGNDMTKGLLTLAKGKPIGKEGYYWLKIHGANCAGVDKVPFPERIKFIEDNHENIMACAKSPLENTWWAEQDSPFCFLAFCFEYAGVQHHGLSYNCSLPLAFDGSCSGIQHFSAMLLDEIGGRAVNLLPSETVQDIYGIVAKKVNVILQADVINGTDNEVVTVTDENTGEISEKVKLGTKALAGQWLAYGVTRSVTKRSVMTLAYGSKEFGFRQQVLEDTIQPAIDSGKGLMFTQPNQAAGYMAKLIWEAVSVTVVAAVEAMNWLKSAAKLLAAEVKDKKTGEILRKRCAVHWVTPDGFPVWQEYKKPIQTRLNLMFLGQFRLQPTINTNKDSEIDAHKQESGIAPNFVHSQDGSHLRKTVVWAHEKYGIESFALIHDSFGTIPADAANLFKAVRETMVDTYESCDVLADFYDQFADQLHESQLDKMPALPAKGNLNLQDILKSDFAFA | 25 | |||
Protein from Enterobacteria phage 285P (GenBank Acc. No. ACV32460.1) | MTNVINAPKNDFSDIANAIMPYNILADHYGAQLAATQLQLEHE AHTEGEKRFLKAMERQIKAGEFGDNAVAKPLLSSLAPKFIEAW NTWFTEVEAKRGKRPVAYNLVQKVAPEAAAFITLKVTLACLTK EEFTNLQSVATKIGRSIEDELRFGRIRDEEAKHFKNHVQEALN KRVGIVYKKAFMQAVEGKMLDAGQLQTKWTTWTPEESIHVGVRMLELLIGSTGLVELHRPFAGNVEKDGEYIQLTEQYVDLLSKRA GALAAIAPMYQPCVVPPKPWTSPVGGGYWAAGRKPLSLVRTGS KKGLERYNDVYMPEVYKAVNIAQNTPWKINKKVLAVVNEIVNW KHCPVEDVPALERGELPVKPEDIDTNEASLKAWKKAASAIYRK EKARVSRRMSMEFMLGQANKFAQFKAIWFPMNMDWRGRVYAVP MFNPQGNDMTKGLLTLAKGKPIGVDGYYWLKIHGANTAGVDKV DFAERIKFIDDNHENIMSVAADPIANTWWAEQDSPFCFLAFCF EYAGVQHHGMNYNCSLPLAFDGSCSGIQHFSAMLRDEIGGRAV | 26 |
NLLPSKEVQDIYRIVAERVNEILNQDVINGTDNEVETVTNKDT GEITEKLKLGTKELAGQWLAYGVTRKVTKRSVMTLAYGSKEYG FRDQVLEDTIQPAIDDGKGLMFTQPNQAAGYMAKLIWNAVTVT VVAAVEAMNWLKSAAKLLAAEVKDKKTKEVLRKRCAVHWVTPD GFPVWQEYKKPVQTRLNLMFLGQIRLQPTVNTNKDSGIDARKQ ESGIAPNFVHSMDGSHLRMTVVRSNEVYGVESFALIHDSFGTI PADAGNLFKAVRETMVNTYEENDVLADFYEQFADQLHESQLDK MPEMPAKGSLDLQEILKSDFAFA | |||||
Protein from Enterobacteria phage BA14 (GenBank Acc. No. ACF15731.1) | MTNVINAPKNDFSDIANAIQPYNILADHYGAQLAATQLELEHE AHTEGEKRFLKAMERQIKAGEFGDNAVAKPLLSSLAPKFIEAW NTWFTEVEAKRGKRPVAYNLVQKVAPEAAAFITLKVTLACLTK EEFTNLQSVATKIGRSIEDELRFGRIRDEEAKHFKNHVQEALN KRVGIVYKKAFMQAVEGKMLDAGQLQTKWTTWTPEESIHVGVRMLELLIGSTGLVELHRPFAGNVEKDGEYIQLTEQYVDLLSKRA GALAAIAPMYQPCVVPPKPWTSPVGGGYWAAGRKPLSLVRTGSKKGLERYNDVYMPEVYKAVNIAQNTPWKINKKVLAVVNEIVNWKHCPVDDVPALERGELPIKPEDIDTNEAALKAWKKAASAIYRKEKARVSRRMSMEFMLGQANKFAQFKAIWFPMNMDWRGRVYAVPMFNPQGNDMTKGLLTLAKGKPIGVDGYYWLKIHGANTAGVDKVDFAERIKFIDDNHENIMSVAADPIANTWWAEQDSPFCFLAFCFEYAGVQHHGMKYNCSLPLAFDGSCSGIQHFSAMLRDEIGGRAVNLLPSKEVQDIYRIVAERVNEILKQDVINGTDNEVEIVTNKDTGEITEKLKLGTKELAGQWLAYGVTRKVTKRSVMTLAYGSKEYGFRDQVLEDTIQPAIDDGKGLMFTQPNQAAGYMAKLIWNAVTVTVVAAVEAMNWLKSAAKLLAAEVKDKKTKEVLRKRCAVHWVTPDGFPVWQEYKKPVQTRLNLMFLGQIRLQPTVNTNKDSGIDARKQESGIAPNFVHSMDGSHLRMTVVRSYEVYGVESFALIHDSFGTIPADAGNLFKAVRETMVNTYEENDVLADFYEQFADQLHESQLDKMPEMPAKGSLDLQEILKSDFAFA | 27 | |||
Protein from Enterobacteria phage EcoDS1 (GenBank Acc. No. ACF15785.1) | MSVISIDKHDFSDVSNAIEPFNLLADHYGQDLAVKQLQLEHEA YTEGERRFIKNLERQTERGELADNQVAKPLMQTLVPKIAQAVR EWHEGPDGKLSTSRPSVAFTMLSTEEKAVKDRSLRISCESASV IILKVILSKLVKPEGIPITPMASAIGRTLEDEIRFGRIRDKEK EHFKKAIADNLNKRAGASYKKAYMQAVETSMLEQGQLEDAWGT WSPTEAVHVGIKMLEIVIQSTQLVELKRYGAGNAAADVEMVHL SDFWVKKMAQRGFSLAGIAPVYQPCVVPPKPWTGVVGGGYWAKGRRPLPLIRLGSKSAVARYEDVYMPEVYDAVNIIQNTPWKVNKKVLEVVNMVEKLNNTPIDDIPQMEPLKPEDYAGETEEELKAWKKAAAGIYRREKARQSRRLSLSFIVNQANKFSQFKAIWFPYNMDWRGRVYAVPMFNPQGNDMQKGLLTLAVGKPIGADGFKWLKVHGANCAGIDKVTFEERIKWVEDNHDNIMATAKAPMDSIEWWGKLDSPFCFLAFCFEYAGVMHHGLSYSCSLPIAFDGSCSGIQHFSAMLRDHIGGHAVNLTPSGKVQDIYRIVSDRIEEELKVLLVNGTDNEMVTHEDKKTGEITERLKLGTRELARQWLTYGMSRKVTKRSVMTLAYGSKEYGFADQVYEDIVMPAIDSGSGAMFTEPSQASRFMAKMIWEAVSVTVVAAVDAMKWLQGAAKLLAAEVKDKKTGEILKPCLPVHWVTPDGFPVWQEYRKKDTTRLNLMFLGSFNLQPTVNKGTKKELDKHKQESGISPNFVHSQDGSHLRKTVVHTHRKYGVMSFAVIHDSFGTIPADAEYLFRGVRETMVETYRDNDVLLDFYEQFEYQLHESQRDKLPELPKKGKLNIEDILSSDFAFA | 28 |
Protein from Yersinia phage Yepe2 (GenBank Acc. No. ACF15684.1) | MTNVINAPKNDFSDIANAIQPYNILADHYGAQLAATQLELEHE AHTEGEKRFLKAMERQIKAGEFGDNTVAKPLLSSLAPKFVEAW NTWFTEVEAKRGKRPVAYNLVQKVAPEAAAFITLKVTLACLTK EEFTNLQSVATKIGRSIEDELRFGRIRDEEAKHFKNHVQEALN KRVGIVYKKAFMQAVEGKMLEAGQLHTKWTTWTPEEVIHVGVRMLELLIGSTGLVELHRPFAGNIEKDGEYIQLTEQYVDLLSKRA GALAAIAPMYQPCVVPPKPWTSPVGGGYWAAGRKPLSMVRTGSKKGLERYNDVYMPEVYKAVNIAQNTPWKINKKVLAVVNEIVSWKHCPVADVPAMERGELPVKPVDIDTNEVALKAWKKAASAIYRKEKARVSRRMSMEFMLGQANKFAQFKAIWFPMNMDWRGRVYAVPMFNPQGNDMTKGLLTLAKGKPIGVDGFYWLKIHGANTAGVDKVDFAERIKFIDDNHENIMSVAADPIANTWWTEQDSPFCFLAFCFEYAGVQHHGMNYNCSLPLAFDGSCSGIQHFSAMLRDEVGGRAVNLLPSKEVQDIYRIVAERVNEILNQDVINGTDNEVETLTNKDTGEITEKLKLGTKELAGQWLAYGVTRKVTKRSVMTLAYGSKEYGFRDQVLEDTIQPAIDDGKGLMFTQPNQAAGYMAKLIWNAVTVTVVAAVEAMNWLKSAAKLLAAEVKDKKTKEVLRNRCAVYWVTPDGFPVWQEYRKPVQTRLNLMFLGQIRLQPTVNTNKDSGIDARKQESGIAPNFVHSMDGSHLRMTVVRSYEVYGVESFALIHDSFGTIPADAGNLFKAVRETMVNTYEENDVLADFYDQFADQLHESQLDK MPEMPAKGSLDIQEILKSDFAFA | 29 | |||
Protein from Klebsiella phage K11 (GenBank Acc. No. ACF15837.1) | MNALNIARNDFSEIELAAIPYNILSEHYGDKLAREQLALEHEA YELGEQRFLKMLERQVKAGEFADNAAAKPLVLTLHPQLTKRID DWKEEQANARGKKPRAYYPIKHGVASKLAVSMGAEVLKEKRGV SSEAIALLTIKVVLGTLTDASKATIQQVSSQLGKALEDEARFG RIREQEAAY FKKNVADQLDKRVGHVYKKAFMQVVEADMISKGM LGGDNWASWKTDEQMHVGTKLLELLIEGTGLVEMTKNKMADGSDDVTSMQMVQLAPAFVELLSKRAGALAGISPMHQPCVVPPKPWVETVGGGYWSVGRRPLALVRTHSKKALRRYADVHMPEVYKAVNLAQNTPWKVNKKVLAVVNEIVNWKHCPVGDVPAIEREELPPRPDDIDTNEVARKAWRKEAAAVYRKDKARQSRRLSMEFMVAQANKFANHKAIWFPYNMDWRGRVYAVSMFNPQGNDMTKGMLTLAKGKPIGLDGFYWLKIHGANCAGVDKVPFPERIKFIEENEGNILASAADPLNNTWWTQQDSPFCFLAFCFEYAGVKHHGLNYNCSLPLAFDGSCSGIQHFSAMLRDSIGGRAVNLLPSDTVQDIYKIVADKVNEVLHQHAVNGSQTVVEQIADKETGEFHEKVTLGESVLAAQWLQYGVTRKVTKRSVMTLAYGSKEFGFRQQVLEDTIQPAIDNGEGLMFTHPNQAAGYMAKLIWDAVTVTVVAAVEAMNWLKSAAKLLAAEVKDKKTKEVLRKRCAIHWVTPDGFPVWQEYRKQNQARLKLVFLGQANVKMTYNTGKDSEIDAHKQESGIAPNFVHSQDGSHLRMTVVHANEVYGIDSFALIHDSFGTIPADAGNLFKAVRETMVKTYEDNDVIADFYDQFADQLHESQLDKMPAVPAKGDLNLRDILESDFAFA | 30 | |||
Protein from Salmonella phage phiSG-JL2 (GenBank Acc. No. ACD75668.1) | MNIIENIEKNDFSEIELAAIPFNTLADHYGSALAREQLALEHE SYELGERRFLKMLERQAKAGEIADNAAAKPLLATLLPKLTARI VEWLEEYASKKGRKPVAYAPLQLLKPEASAFITLKVILASLTS TNMTTIQAAAGMLGKAIEDEARFGRIRDLEAKHFKKHVEEQLN KRHGQVYKKAFMQVVEADMIGRGLLGGEAWSSWDKETTMHVGIRLIEMLIESTGLVELQRHNAGNAGSDHEALQLAQEYVDVLAKR AGALAGISPMFQPCVVPPKPWVAITGGGYWANGRRPLALVRTHSKKGLMRYEDVYMPEVYKAVNIAQNTAWKINKKVLAVVNEIVNWKNCPVADIPSLERQELPPKPDDIDTNEAALKEWKKAAAGVYRLDKARVSRRISLEFMLEQANKFASKKAIWFPYNMDWRGRVYAVPMFNPQGNDMTKGLLTLAKGKPIGEEGFYWLKIHGANCAGVDKVPFPERIAFIEKHVDDILACAKDPINNTWWAEQDSPFCFLAFCFEYAGVAHHGLSYNCSLPLAFDGSCSGIQHFSAMLRDEVGGRAVNLLPSETVQDIYGIVAQKVNEILKQDAINGTPNEMITVTDKD TGEISEKLKLGTSTLAQQWLAYGVTRSVTKRSVMTLAYGSKEF GFRQQVLDDTIQPAIDSGKGLMFTQPNQAAGYMAKLIWDAVSVTVVAAVEAMNWLKSAAKLLAAEVKDKKTKEILRHRCAVHWTTPDGFPVWQEYRKPLQKRLDMIFLGQFRLQPTINTLKDSGIDAHKQESGIAPNFVHSQDGSHLRMTVVYAHEKYGIESFALIHDSFGTIPADAGKLFKAVRETMVLTYENNDVLADFYDQFADQLHETQLD KMPPLPKKGKLNLQDILKSDFAFA | 31 |
Protein from Yersinia phage Berlin (GenBank Acc. No. CAJ70654.1) | MTNVINAPKNDFSDIANAIQPYNILADHYGAQLAATQLELEHE AHTEGEKRFLKAMERQIKAGEFGDNTVAKPLLSSLAPKFIEAW NTWFIDVEAKRGKRPVAYNLVQKVAPEAAAFITLKVTLACLTK EEFTNLQSVATKIGRSIEDELRFGRIRDEEAKHFKNHVQEALN KRVGIVYKKAFMQAVEGKMLDAGQLQTKWTTWTPEEVIHVGVRMLELLIGSTGLVELHRPFAGNIEKDGEYIQLTEQYVDLLSKRA GALAAIAPMYQPCVVPPKPWTSPVGGGYWAAGRKPLSMVRTGSKKGLERYNDVYMPEVYKAVNIAQNTPWKINKKVLAVVNEIVNWKHCPVADVPAMERGELPVKPVDIDTNEASLKAWKKAASAIYRKEKARVSRRMSMEFMLGQANKFAQFKAIWFPMNMDWRGRVYAVPMFNPQGNDMTKGLLTLAKGKPIGVDGFYWLKIHGANTAGVDKVDFAERIKFIEDNHENIMSVAADPIANTWWTEQDSPFCFLAFCFEYAGVQHHGMNYNCSLPLAFDGSCSGIQHFSAMLRDEVGGRAVNLLPSKEVQDIYRIVAERVNEILNQDVINGTDNEVETLTNKDTGEITEKLKLGTKELAGQWLAYGVTRKVTKRSVMTLAYGSKEYGFRDQVLEDTIQPAIDDGKGLMFTQPNQAAGYMAKLIWNAVTVTVVAAVEAMNWLKSAAKLLAAEVKDKKTKEVLRNRCAVYWVTPDGFPVWQEYRKPVQTRLNLMFLGQIRLQPTVNTNKDSGIDARKQESGIAPNFVHSMDGSHLRMTVVRSYEVYGVESFALIHDSFGTIPADAGNLFKAVRETMVNTYEENDVLADFYDQFADQLHESQLDKMPEMPAKGSLDIQEILKSDFAFA | 32 | |||
RNA polymerase from Salmonella phage Vi06 (GenBank Acc. No. CBV65202.1) | MNTISITKNDFSDIELAAIPFNTLADHYGERLAREQLALEHES YEMGEVRFRKMFERQLKAGEIADNDATKPLITTLLPKMIARIN SWFKEVQAKCGKRPTAFQFLQGIKPEAIAYITIKTTLARLTSM DNTTVQAVASAIGRAIEDEARFGRIRDLEAKHFKKNVEEQLNK RVGHVYKKAFMQVIEADMLSKGLLGGESWSSWHKEDSIHVGVRCIEMLIESTGMVSLHRQNAGVVGQDSETIELAPEYAEAIATRAGALAGISPMFQPCVVPPKPWTSISGGGYWANGRRPLALVRTHSKKALMRYADVYMPEVYKAVNIAQNTAWRINKKVLAVANVVTKWKHCPVDYIPTIEREELPMKPEDIDTNPEALASWKRAAAAVYRKDKARKSRRMSLEFMLEQANKFANHRAIWFPYNMDWRGRVYAVSMFNPQGNDMTKGLLTLAKGKPIGKEGFYWLKIHGANCAGVDKVPFPERIKFIEDNHENILACAKSPLENTWWSEQDSPFCFLAFCFEYAGGQHHGLSYNCSLPLAFDGSCFGIQHFSVMLRDEVGGRAVNLLPSETVQDIYGIVAKKVNEILQVDMINGTDNEVVTVTDDKTGEIYEKIKLGTKELAGQWLAYGVTRSVTKRSVMTLAYGSKEFGFRQQVLEDTIQPAIDSGKGLMFTHPNQAAGYMAKLIWEAVSVTVVAAVEAMNWLKSAAKLLAVEVKDRKTGEILRKRCAVHWTTPDGFPVWQEYKKPVQTRLNLIFLGQFRLQPTINTNRDSEIDAYKQESGIAPNFVHSQDGSHLRKTVVWAHEKYGIDSFALIHDSFGTIPADAANLFKAVRETMVATYESCDVLADFYAQFADQLHKSQLDK MPVLPSKGNLNLQDILKSDFAFA | 33 |
RNA polymerase from Pseudomonad phage gh-1 (GenBank Acc. No. AAO73140.1) | MTIAIPERHDFSDINSSAAFDALSNIYGPALAAEQLQLEHEAY TLGEERFHKAMERQMERGEFSNSQVAKPLLGHLVPMLSKAITD WIEHQTTKVRRKHVALGAFQQMNPETMASIVIRWTINRIAQRSGAPTITEMAVSIGGALEEEARFGRIRVLEQQHYQKHIKKALAQRNGMTYKVAYMEKVEAHMIEAGQLNEPWTEWDQSGADVRYHMGIRMLELLIESTQLIEVVREHKGNKKLDGEYVYLKAEWADKLQSRAYILSGVFPRYQPMVVPPKPWNGVRGGGYWAKGRKPVTFIRVPTKRALNRYRDVHMPEVYKAVNLAQATPWAINQKVLAVANAVMSWENVPIKEFPSTEREALPIKPGDIETNEEALKAWKKAAAGVYRKDAARVSRRLSYEFSLEQANKFAEYDAIYFPYNLDWRGRVYAIPAFNPQSNDMTKGILQAAKGEPVGKDGIEWLMIHGANCAGVDKVDFSQRKQWIKDNEEMILRCAHDPLINTDWMDMDSPFCFLAFCFEWQGVKLHGEAHVSALPIAFDGSCSGIQHFSAMLRDERGGRAVNLLQSDDVQDIYKLVSDEVEIALQWDLKYGTEDSTVLDTNEDTGEITERRVLGTKTLAMAWLTYGMSRKVTKRSVMTLAYGSKAYGFADQVREDIVKKAIDNGDGEMFTSPGEASRYMAGKIWDSVSVVVVAAVEAMNWLQKAAKLLASEVKCKKTKQVLKPAMPVYWVTPDGFPVWQEYMIPETRRIDLMFLGDVRIQATVTVRDSDKIDARKQESGISPNFVHSQDGSHLRKTVVHAAERYGIEFFALIHDSFGTIPAHAGAMFKAVRETMVETYESNNVLEDFREQFMDQLHESQLDKMPPIPEMGTLDIREILKSQFAFA | 34 | |||
RNA polymerase from Enterobacteria phage K1F (GenBank Acc. No. AAZ72968.1) | MSVISIDKHDFSDVSNAIEPFNLLADHYGQDLAVKQLQLEHEA YTEGERRFIKNLERQTERGELADNQVAKPLMQTLVPKIAQAVK EWHEGPDGKLSTSRPSVAFTMLSTEERAVKDRSLRISCESAAV IILKVILSKLVKPEGIPITPMASAIGRTLEDEIRFGRIRDKEK EHFKKAIADNLNKRAGASYKKAYMQAVEASMLEQGQLEDAWGTWSPTEAVHVGIKMLEIVIQSTQLVELKRYGAGNAAADVEMVHLSDFWVKKMAQRGFSLAGIAPVYQPCVVPPKPWTGVVGGGYWAKGRRPLPLIRLGSKSAVARYEDVYMPEVYEAVNIIQNTPWKVNKKVLDVVNMVEKLNNTPIDDIPQMEPLKPEAYAGETEEELKAWKKAAAGIYRREKARQSRRLSLSFIVNQANKFSQFKAIWFPYNMDWRGRVYAVPMFNPQGNDMQKGLLTLAVGKPIGADGFKWLKVHGANCAGVDKVTFEERIKWVEDNHDNIMAAAKAPMDSIEWWGKLDSPFCFLAFCFEYAGVMHHGLSYSCSLPIAFDGSCSGIQHFSAMLRDHIGGHAVNLTPSGKVQDIYRIVSDRIEEELKVLLVNGTDNEMVTHEDKKTGEITERLKLGTRELARQWLTYGMSRKVTKRSVMTLAYGSKEYGFADQVYEDIVMPAIDSGSGAMFTEPSQASRFMAKMIWEAVSVTVVAAVDAMKWLQGAAKLLAAEVKDKKTGEILKPCLPVHWVTPDGFPVWQEYRKKDTTRLNLMFLGSFNLQPTVNKGTKKELDKHKQESGISPNFVHSQDGSHLRKTVVHTHRKYGVMSFAVIHDSFGTIPADAEYLFRGVRETMVETYRDNDVLLDFYEQFEYQLHESQRDKLPELPKKGKLNIEDILSSDFAFA | 35 | |||
RNA polymerase from Yersinia phage phiA1122 (GenBank Acc. No. AAP20500.1) | MNTINIAKNDFSDIELAAIPFNTLADHYGERLAREQLALEHES YEMGEARFRKMFERQLKAGEVADNAAAKPLITTLLPKMIARIN DWFEEVKAKRGKRPTAFQFLQEIKPEAVAYITIKTTLACLTSA DNTTVQAVASAIGRAIEDEARFGRIRDLEAKHFKKNVEEQLNK RVGHVYKKAFMQVVEADMLSKGLLGGEAWSSWHKEDSIHVGVRCIEMLIESTGMVNLHRQNAGVVGQDSETIELTPEYAEAIATRAGALAGISPMFQPCVVPPKPWTGITGGGYWANGRRPLALVRTHSKKALMRYEDVYMPEVYKAINIAQNTAWKINKKVLAVANVITKWKHCPVEDIPAIEREELPMKPEDIDTNPEALTAWKRAAAAVYRKDKARKSRRISLEFMLEQANKFANHKAIWFPYNMDWRGRVYAVSMFNPQGNDMTKGLLTLAKGKPIGKEGYYWLKIHGANCAGVDKVPFPERIKFIEDNHENIMACAKSPLENTWWAEQDSPFCFLAFCFEYAGVQHHGLSYNCSLPLAFDGSCSGIQHFSAMLLDEVGGLAVNLLPSATVQDIYGIVAKKVNVILQADVINGTDNEVVTVTDENTGEIPEKVKLGTKALAGQWLAYGVTRSVTKRSVMTLAYGSKEFGFRQQVLEDTIQPAIDSGKGLMFTQPNQAAGYMAKLIWEAVSVTVVAAVEAMNWLKSAAKLLAAEVKDKKTGEILRKRCAVHWVTPDGFPVWQEYKKPIQTRLNLMFLGQFRLQPTINTNKDSEIDAHKQESGIAPNFVHSQDGSHLRKTVVWAHEKYGIESFALIHDSFGTIPADAANLFKAVRETMVDTYESCDVLADFYDQFADQLHESQLDKMPALPAKGNLNLQDILKSDFAFA | 36 |
RNA polymerase from Yersinia phage phiYeO3-12 (GenBank Acc. No. CAB63592.1) | MNIIENIEKNDFSEIELAAIPFNTLADHYGSALAREQLALEHE SYELGERRFLKMLERQAKAGEIADNAAAKPLLATLLPKLTTRI VEWLEEYATKKGRKPVAYAPLQSLKPEASAFITLKVILASLTS TNMTTIQAAAGMLGKAIEDEARFGRIRDLEAKHFKKHVEEQLN KRHGQVYKKAFMQVVEADMIGRGLLGGEAWSSWDKETTMHVGIRLIEMLIESTGLVELQRHNAGNAGSDHEALQLAQEYVDVLAKRAGALAGISPMFQPCVVPPKPWVAITGGGYWANGRRPLALVRTHSKKGLMRYEDVYMPEVYKAVNIAQNTAWKINKKVLAVVNEIVNWKNCPVADIPSLERQELPPKPDDIDTNEAALKEWKKAAAGIYRLDKARVSRRISLEFMLEQANKFASKKAIWFPYNMDWRGRVYAVPMFNPQGNDMTKGLLTLAKGKPIGEEGFYWLKIHGANCAGVDKVPFPERIAFIEKHVDDILACAKDPINNTWWAEQDSPFCFLAFCFEYAGVAHHGLSYNCSLPLAFDGSCSGIQHFSAMLRDEVGGRAVNLLPSETVQDIYGIVAQKVNEILKQDAINGTPNEMITVTDKDTGEISEKLKLGTSTLAQQWLAYGVTRSVTKRSVMTLAYGSKEFGFRQQVLDDTIQPAIDSGKGLMFTQPNQAAGYMAKLIWDAVSVTVVAAVEAMNWLKSAAKLLAAEVKDKKTKEILRHRCAVHWTTPDGFPVWQEYRKPLQKRLDMIFLGQFRLQPTINTLKDSGIDAHKQESGIAPNFVHSQDGSHLRMTVVYAHENYGIESFALIHDSFGTIPADAGKLFKAVRETMVITYENNDVLADFYDQFADQLHETQLDKMPPLPKKGNLNLQDILKSDFAFA | 37 | |||
RNA polymerase from Kluyvera phage Kvp1 (GenBank Acc. No. ACJ14548.1) | MNVINAPKNDFSDIANAIQPYNILADHYGAQLAATQLELEHEA HTEGEKRFLKAMERQIKAGEFGDNAVAKPLLSSLAPKFIEAWN TWFTEVEAKRGKRPVAYNLVQKVAPEAAAFITLKVTLACLTKE EFTNLQSVATKIGRSIEDELRFGRIRDEEAKHFKNHVQEALNK RVGIVYKKAFMQAVEGKMLDAGQLQTKWTTWTPEESIHVGVRMLELLIGSTGLVELHRPFAGNVEKDGEYIQLTEQYVDLLSKRAG ALAAIAPMYQPCVVPPKPWTSPVGGGYWAAGRKPLSLVRTGSK KGLERYNDVYMPEVYKAVNIAQNTPWKINKKVLAVVNEIVNWK HCPVEDVPALERGELPVKPEDIDTNEAALKAWKKAASAIYRKE KARVSRRMSMEFMLGQANKFAQFKAIWFPMNMDWRGRVYAVPMFNPQGNDMTKGLLTLAKGKPIGVDGYYWLKIHGANTAGVDKVDFAERIKFIDDNHENIMSVAADPIANTWWAEQDSPFCFLAFCFEYAGVQHHGMNYNCSLPLAFDGSCSGIQHFSAMLRDEVGGRAVNLLPSKEVQDIYRIVAERVNEMLREAVINGTDNEVETVTNKDTGEITEKLKLGTKELAGQWLAYGVTRKVTKRSVMTLAYGSKEYGFRDQVLEDTIQPAIDDGKGLMFTQPNQAAGYMAKLIWESVTVTVVAAVEAMNWLKSAAKLLAAEVKDKKTKEVLRKRCAVHWVTPDGFPVWQEYKKPVQTRLNLMFLGQIRLQPTVNTNKDSGIDARKQESGIAPNFVHSMDGSHLRMTVVRSNEVYGVESFALIHDSFGTIPADAGNLFKAVRETMVNTYEENDVLADFYEQFADQLHESQLDKMPEMPAKGSLDLQEILKSDFAFA | 38 |
RNA polymerase from Morganella phage MmP1 (GenBank Acc. No. ACY74627.1) | MSIAAAVNKNDFSDVELAAIPFNTLADHYGADLAREQLQLEHE SYVMGEERFRKMLERQEKAEEFGDSSVAKPLIITLLPKVTQRI TDWLNEWADPNKKGRKPIAYTHLKDIKPETLAFITIKVVLNKL AGKDDAFMQPLAYAIGSSIEDEARFGRIRELEMAHFKKCAEEN LNKRRGTAYRKAFLSVVEADMLDKGLLGGESWGTWNKTDVMNIGISMLEKLIEATGLVELREKRNFEEMDRIVIAEEYVKAMATRA QSLAGISPMYQPCVVPPKPWVSITGGGYWANGRKPTALIRTHTRKALYRYEDVYMPEVYKAINYAQETPWRINRKVLAVVNELVKWKNNPVKDMPSIDKLELPQRPDDIDTNEEALRSWKREAAAVYRKDEQRKSRYLSMSFALEQANKFSNKKAIYFPYNMDWRGRVYALPMFNPQGNDMVKGLLTLAKGKPIGKDGFYWLKIHGANTAGVDKVTFPERIKFIEDNHDNIMQCAESPLDNLWWTEQDSPFCFLAFCFEYAQVTKKGLGWVCSLPIALDGSCSGIQHFSAMLRDDIGGRAVNLLPSETVQDIYGIVADKVNEALKELVINGTDNYTDTVTDKSTGEIIERYRLGEKELARQWLEFGVTRSVTKRSVMTLAYGSKEYGFRDQVLEDTIRPAIDSGKGAMFTNPSQAASFMAKRIWEAVSVTVVAAVGAMKWLQSSAKLMAAEVKDKKTKEVLRKRCAVHWVTPDGFPVWQEYRKPKQKRVHLMFLGSYYDARMKETSSDCSIDAHKQESGISPNFVHSQDGNHLRMTVVYAREKYNVESFALIHDSFGTIPADVPNLFKAVRETMVNMYENNDVLADFYEQFADQLHESQLDKMPALPPKGKLNLQDILKSDFAFA | 39 | |||
RNA polymerase from Vibrio phage N4 (GenBank Acc. No. ACR16468.1) | MANVIKPESHNFSDISAAILPFNVLADSYGEALAAEQLMLEHE SYQLGEARFIKAMERQVERGEVSDNAVAKPLLDTLIPALAARI TEFVEMKQRGKPHVSKGYFAMIKPESAAFIIVKTTLNILAKEE SVPVQRVAMAIGGNIEDEIRFGRIRDEEIKHFKERVKPNLDKR NGFIYKKAYMEAVEAGMQDKGELNSTHEAWEKDVKFHVGIRAIEMLIEATGMVQLERKFKGIPDKDHEALHLAPEYVEKLTNRAHA LAGISPMYQPMIVKPKRWTGVQGGGYWAKGRRPLNLIRVGSKRALDRYRQVDMPEVYDAINTIQETAWRINKDVLAVVNNVVTWANCPVEDVPSIDKLALPEKPEDIDSNEESLKKWKKAAAAIYRKEK ARQSRRISLEFALSQANKFSKYNEIYFPYNMDWRGRVYAIPMF NPQGNDMVKGLLTFAKKVPVGIDGGYWLAVHGANCAGVDKVSLEDRVKWVNDNEANILASAEAPLDFTWWAEQDSPFCFLAFCFEWAAYVKAGKKPSFESSLPLAFDGTCSGLQHFSAMLRDEIGGAAVNLLPADKPQDIYGIVAVKVNEVLRDLVISGTEDEMQTLEDKKT GEITERLVLGTRTLAAQWLEYGVTRSVTKRSVMTLAYGSKEYG FADQVFEDTVMPAIDNGKGAMFTEPSQACRFMAKLIWDAVSKT VVAAVEAMQWLQSAAKLVSSEVKDKKSGEILKHAMPVHWTTPN GFPVWSEYCKQEQKRIDCVILGTHRMALTINIRDKKEIDAAKQ TSGIAPNFVHSMDASHLQMTVNKCFKVYGIHSFAMIHDSFGCH AGFASKMFRAVRETMVETYEEHDVIQEFYNQFEQQLHESQIEK MPVLPRKGNLELREILKSLYTFS | 40 | |||
RNA polymerase from Vibriophage VP4 (GenBank Acc. No. AAY46276.1) | MANVIKPQSHNFSDISAAILPFNVLADSYGEALAAEQLMLEHE SYQLGEARFIKAMERQVERGEVSDNAVAKPLLDTLIPALAARI TEFVEMKQRGKPHVSKGYFAMIKPESAAFIIVKTTLNILAKEE SVPVQRVAMAIGGNIEDEIRFGRIRDEEIKHFKERVKPNLDKR NGFIYKKAYMEAVEAGMQDKGELNSTHEAWEKDVKFHVGIRAIEMLIEATGMVQLERKFKGIPDKDHEALHLAPEYVEKLTNRAHA LAGISPMYQPMIVKPKRWTGVQGGGYWAKGRRPLNLIRVGSKRALDRYRQVDMPEVYDAINTIQETAWRINKDVLAVVNNVVTWTNCPVEDVPSIDKLALPEKPEDIDNNEESLKKWKKAAAAIYRKEK ARQSRRISLEFALSQANKFSKYNEIYFPYNMDWRGRVYAIPMF NPQGNDMVKGLLTFAKKVPVGIDGGYWLAVHGANCAGVDKVSLEDRVKWVNDNEANIIASAEAPLDFTWWAEQDSPFCFLAFCFEWAAYVKAGKKPSFESSLPLAFDGTCSGLQHFSAMLRDEIGGAAVNLLPADKPQDIYGIVAVKVNEVLRDLVISGTEDEMQTLEDKKT GEITERLVLGTRTLAAQWLEYGVTRSVTKRSVMTLAYGSKEYG FADQVFEDTVMPAIDNGKGTMFTEPSQACRFMAKLIWDAVSKT VVAAVEAMQWLQSAAKLVSSEVKDKKSGEILKHAMPVHWTTPN GFPVWSEYCKQEQKVIDCVILGSMRLQLKLNMRDKKEIDTAKQ ASGIAPNFVHSMDASHLQMTVNKCFKVYGIHSFAMIHDSFGCH AGFASKMFRAVRETMVETYEEHDVIQEFYNQFEKQLHESQIEK MPALPRKGNLELREILKSLYTFS | 41 |
Protein from Enterobacteria phage K11 (GenBank Acc. No. CAA37330.1) | MNALNIGRNDFSEIELAAIPYNILSEHYGDQAAREQLALEHEA YELGRQRFLKMLERQVKAGEFADNAAAKPLVLTLHPQLTKRID DWKEEQANARGKKPRAYYPIKHGVASELAVSMGAEVLKEKRGV SSEAIALLTIKVVLGNAHRPLKGHNPAVSSQLGKALEDEARFG RIREQEAAYFKKNVADQLDKRVGHVYKKAFMQVVEADMISKGMLGGDNWASWKTDEQMHVGTKLLELLIEGTGLVEMTKNKMADGSDDVTSMQMVQLAPAFVELLSKRAGALAGISPMHQPCVVPPKPWVETVGGGYWSVGRRPLALVRTHSKKALRRYADVHMPEVYKAVNLAQNTPWKVNKKVLAVVNEIVNWKHCPVGDVPAIEREELPPRPDDIDTNEVARKAWRKEAAAVYRKDKARQSRRCRCEFMVAQANKFANHKAIWFPYNMDWRGRVYAVSMFNPQGNDMTKGSLTLAKGKPIGLDGFYWLKIHGANCAGVDKVPFPERIKFIEENEGNILASAADPLNNTWWTQQDSPFCFLAFCFEYAGVKHHGLNYNCSLPLAFDGSCSGIQHFSAMLRDSIGGRAVNLLPSDTVQDIYKIVADKVNEVLHQHAVNGSQTVVEQIADKETGEFHEKVTLGESVLAAQWLQYGVTRKVTKRSVMTLAYGSKESLVRQQVLEDTIQPAIDNGEGLMFTHPNQAAGYMAKLIWDAVTVTVVAAVEAMNWLKSAAKLLAAEVKDKKTKEVLRKRCAIHWVTPDGFPVWQEYRKQNQARLKLVFLGQANVKMTYNTGKDSEIDAHKQESGIAPNFVHSQDGSHLRMTVVHANEVYGIDSFALIHDSSGTIPADAGNLFKAVRETMVKTYEDNDVIADFYDQFADQLHESQLDKMPAVPAKGDLNLRDILESDFAFA | 42 | |||
RNA polymerase from Synechococcus virus Syn5 (GenBank Acc. No: YP_001285424.1); Bold font represents position V693 that aligns and corresponds to position V783 of T7 RNA polymerase | MSFDLIARQLQRETEAAELARKRLQDARREANERSYASSNIES RKAIATFLDPIAQRIGERLFTLRRGTGAVDAAEVYKHLKNADH HHLALITMKTALDVLGKDPEPQIQQLTTAIGRNIQLELRLTYY AEENPELYKQASRFFHAGTGTRQKATVIKLKFNREGIEWDQWS RVTCHKVGQWLMLAMADVTGWIERATDRTSGGRKTKTRICYSREFLQHRDTILAAAEQLAFCQWPMLCPPIEWSNDHNGGYLSEQIRRVNPLIRKTGPLGTRKQGDIPLAMLNNLQGQAYKVNPEVLDIANHCYESNVTVGKFIRHAPLPVPPSPGEDCTEDQLTAYKRARREAEDFNAQISQKNWRTTEVMYVARKYADEASFWMPASFDYRGRVYFLNTALNPQGTDFDKALLYFAEEGPVNEWWLSFHVATTYGLDKETMVNRVQWARDNHELIDRIASDPVRHTEWHDADEPWCFLAACLEYKACVIDGTKQTSGLPIGIDATCSGLQHLAAMTRCGRTAALVNVTPTDKPADAYKTVAQASLKHLPKEQHEWITRKVTKRPVMCTPYGVTMSSARGYIRDQLVKDGHKEDLRSPGVLNGIVKAIFNEAIPEVIPGPVQVMAWLKRSAGQIIDRGDSTITWTTPSGFEVVQDLKKSKTYEVKTRIMGGARIKLQVGDGFTDEPDRDHHKSALAPNWHSNDASLLHLTFAFWDKPFTVIHDCVLGRSCDMDQMGSDIRLHFAEMYKADVMQDWADQVGVELPVDLIKNTLDIDSVNQSLYFFS | 43 | |||
T7 promoter sequence with 3′ -GG | TAATACGACTCACTATAGGG | 44 |
Anchored Oligo(dT) | GGGCAGACGTGTGCTCTTCCGATCTTTTTTTTTTTTTTTTTTT TTTTTTVN, wherein V is A or G, and N is A, T, C or G | 45 | |||
Oligonucleotide tethered to ddNTP | NNNNNNNNAGATCGGAAGAGCGTCGTGTA-biotin, wherein N is A, T, C or G | 46 | |||
Fw primer for gene Z | TACACGACGCTCTTCCGATCTCAATGGCCATTAACCGCGTTG | 47 | |||
Rev primer for gene Z | CAGACGTGTGCTCTTCCGATCTACCTTTCAGGGATGAACGCTG | 48 | |||
Fw primer for gene U | TACACGACGCTCTTCCGATCTGCAGTTGCCGTTTATCTCACC | 49 | |||
Rev primer for gene U | CAGACGTGTGCTCTTCCGATCTAACTCCACAAGCCCGCATCAT | 50 | |||
Fw primer for gene V | TACACGACGCTCTTCCGATCTCGAATCCGCTTTCAGACGTTG | 51 | |||
Rev primer for gene V | CAGACGTGTGCTCTTCCGATCTGGGTATCGCCTTCATTAAACC | 52 | |||
Fw primer for gene G | TACACGACGCTCTTCCGATCTTGATGAAACGGCAGGCAGAAC | 53 | |||
Rev primer for gene G | CAGACGTGTGCTCTTCCGATCTTACATACCAGACAGCCGGTAC | 54 | |||
Fw primer for gene T | TACACGACGCTCTTCCGATCTCTGCTGGATATGCACTTTTCC | 55 | |||
Rev primer for gene T | CAGACGTGTGCTCTTCCGATCTCCCTTCTGATACTGTCATCAG | 56 | |||
Fw primer for gene H | TACACGACGCTCTTCCGATCTGATATTGGTCGTCCTGATACC | 57 | |||
Rev primer for gene H | CAGACGTGTGCTCTTCCGATCTTCGGTATATTTCAGCCGTGAC | 58 | |||
Fw primer for gene M | TACACGACGCTCTTCCGATCTGCTTTGGTGATGGCTATTCTC | 59 | |||
Rev primer for gene M | CAGACGTGTGCTCTTCCGATCTCAGTTCACCACCTGTTCAAAC | 60 | |||
Fw primer for gene L | TACACGACGCTCTTCCGATCTTTTACGCCCGTTTTCTGGATG | 61 | |||
Rev primer for gene L | CAGACGTGTGCTCTTCCGATCTGACGTTGGCTGGTCATATTCA | 62 | |||
Fw primer for gene K | TACACGACGCTCTTCCGATCTCGATTCATAAGTTCCGCTGTG | 63 | |||
Rev primer for gene K | CAGACGTGTGCTCTTCCGATCTTGATTCGGCACTGATGAACCA | 64 | |||
Fw primer for gene I | TACACGACGCTCTTCCGATCTTTACAACGATTTGGTCGCCGC | 65 | |||
Rev primer for gene I | CAGACGTGTGCTCTTCCGATCTGACAATCTGGAATACGCCACC | 66 | |||
Fw primer for gene J | TACACGACGCTCTTCCGATCTAGCGTGAAAGCAGTGTGGACT | 67 | |||
Rev primer for gene J | CAGACGTGTGCTCTTCCGATCTCCGCTGGCATGTCAACAATAC | 68 |
As used herein, a “mutant” of an RNA polymerase (RNAP) refers to an RNAP that comprises one or more modifications of the amino acid sequence, for example by substitution, deletion, insertion or chemical modification, wherein such modifications change the functionality of the RNAP. This change in functionality may be an increase or decrease in a given function. A mutant polymerase may also have additional amino acid modifications that do not alter function of the protein or peptide. A mutant may comprise one or more amino acids that have been replaced by their respective D-stereoisomers or by amino acids other than the naturally occurring 20 amino acids, such as, for example, ornithine, hydroxyproline, citrulline, homoserine, hydroxylysine, or norvaline.
As used herein, a “fragment” refers to an N-terminally and/or C-terminally shortened polypeptide, i.e. a polypeptide that lacks one or more of the N-terminal and/or C-terminal amino acids. Usually, the fragments are still functional, i.e. retain the biologic activity of the full-length polypeptide at least to a certain extent. The fragments may be at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, or at least 700 amino acids long and retain the polymerase activity of the protein. As used herein, an RNAP may comprise a fragment of an RNAP, such as a fragment comprising the catalytic domain of a given RNAP.
The terms “identical” or percent “identity,” in the context of two or more polypeptide sequences, refer to two or more sequences or subsequences that have a homology of 100% when aligned. Sequences are “X% identical” if two sequences have a specified X percentage of amino acid residues that are the same (i.e., X% may be 80%, 85%, 90%, or 95% identity over a specified region, or, when not specified, over the entire sequence), when compared and aligned for maximum correspondence over a comparison window or designated region, as measured using one of the sequence comparison algorithms or by manual alignment and visual inspection. A “comparison window,” as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150, in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443, by the search for similarity method of Pearson and Lipman (1988) Proc. Nat’l. Acad. Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Ausubel et al., Current Protocols in Molecular Biology (1995 supplement)). Two examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215:403-410, and Altschul et al. (1997) Nuc. Acids Res. 25:3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/).
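As an illustration of the “X% identical” definition, the following minimal sketch counts identical residues over the columns of an alignment that has already been computed. The function name `percent_identity`, the use of `-` as the gap character, and the choice to divide by the number of aligned columns are assumptions made for this example; other conventions for the denominator exist, and the alignment programs named above may report identity differently. The two input strings are the first eight residues of SEQ ID NO: 25 and SEQ ID NO: 30 as listed in the table, used here only as a short comparison window.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both rows carry the same residue.

    Both inputs must be rows of the same alignment (equal length).
    Assumption: the denominator is the number of columns, excluding
    columns in which both rows are gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns that are gaps in both rows; they carry no comparison.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no comparable columns")
    # A column counts as a match only if the residues are equal and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# First eight residues of SEQ ID NO: 25 and SEQ ID NO: 30 (no gaps needed).
window_a = "MNTINIAK"
window_b = "MNALNIAR"
identity = percent_identity(window_a, window_b)
print(f"{identity:.1f}% identical over {len(window_a)} positions")
```

A window this short is far below the 20 to 600 positions contemplated for a comparison window and is used only to keep the example readable.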
“Biological activity” or the property of being “functional,” as used herein, refers to an enzymatic activity of a polypeptide, an interaction with another molecule, or a cellular localization of a polypeptide. The functional or biological activity of a polypeptide may be 20 or more, 30 or more, 40 or more, 50 or more, 60 or more, 70 or more, 80 or more, 90 or more, or 100 percent, or greater than 100 percent of the activity of an appropriate reference, e.g. a wild-type polypeptide.
The term “polymerase activity,” as used herein, relates to the enzymatic functionality of a wt and mutant T7 RNA polymerase and means that a T7 RNA polymerase (T7 RNAP) is capable of synthesizing a nucleic acid molecule from substrate nucleotides that may be wild type (i.e. canonical, ribo-) nucleotides and/or modified/non-canonical nucleotides. Polymerase activity is also considered to be present, if a T7 RNAP can use only one specific modified nucleotide as a substrate and incorporate it into the nucleic acid being synthesized and/or only produces short molecules of only 2-10 nucleotides.
A “non-canonical” nucleotide or substrate, as used herein, relates to any nucleotide other than the ribonucleotides (rNTPs) conventionally used (i.e. that are canonical) as substrates by WT T7 RNAP. Non-canonical substrates include deoxynucleotides (dNTPs) and 2′-modified rNTPs, such as 2′-methoxy or 2′-F modified rNTPs. Non-canonical nucleotides also include oligonucleotide-tethered nucleotides.
A “promoter,” as used herein, is a regulatory nucleotide sequence that stimulates transcription. For example, multiple RNA copies can be transcribed from a DNA template that includes a functional promoter recognized by RNA polymerase. Such a DNA template comprises a promoter operably linked to a target nucleotide sequence to be transcribed.
The term “operably linked,” as used herein, refers to the association of two or more nucleic acid fragments on a single vector so that the function of one is affected by the other. For example, a promoter may be operably linked with a target nucleic acid sequence, wherein the promoter can affect the expression (e.g. in an in vitro transcription) of that target sequence. In this example, the target nucleic acid sequence is under the transcriptional control of the promoter.
A “target nucleic acid,” as used herein, is a nucleic acid sequence of interest. A target nucleic acid may be a specific sequence (for example, a specific gene) or target nucleic acids may be sequences from a whole genome. A target nucleic acid may be comprised in a biological sample.
A “template nucleic acid,” as used herein, is a nucleic acid molecule that comprises a promoter operably linked to a target nucleotide sequence to be transcribed. A target nucleotide sequence portion of a template nucleic acid may be double stranded or single stranded, as long as a promoter portion is double stranded. A template nucleic acid may be a DNA or RNA molecule. A template may be part single-stranded and part double-stranded.
As used herein, the term “amplification” relates to the production of additional copies of a nucleic acid molecule. Amplification as used herein is often carried out using polymerase chain reaction (PCR) technologies well known in the art, but may also be carried out by other means including isothermal amplification methods such as, e.g., transcription mediated amplification, strand displacement amplification, rolling circle amplification, loop-mediated isothermal amplification, helicase dependent amplification, single primer isothermal amplification or recombinase polymerase amplification.
This application is related to mutant polymerases with improved ability to use 2′-modified ribonucleoside triphosphates (rNTPs) or deoxynucleotide triphosphates (dNTPs) as substrates. In some instances, the mutant polymerase is a phage-type RNA polymerase (RNAP), such as T7, SP6, or T3 RNAP.
T7 RNA polymerase (which may be referred to as “wild type T7 RNA polymerase” or “WT T7 RNAP”) is a DNA-directed RNA polymerase of bacteriophage T7 (enterobacteria phage T7) with the UniProtKB/Swiss-Prot Accession No. P00573 (version 98 of the entry and version 2 of the sequence) and as set forth in SEQ ID NO: 1. WT T7 RNAP also includes isoforms of this protein, in particular naturally occurring isoforms. WT T7 RNAP also includes variants of this protein, which comprise one or more changes in amino acid sequence that have little or no effect on the function of the RNAP. The polypeptide is encoded by nucleotides 3171 to 5822 of the T7 bacteriophage genome. Examples of related RNAPs include SP6 RNAP as set forth in SEQ ID NO: 2 and T3 RNAP as set forth in SEQ ID NO: 3.
As used herein, a “phage-type RNA polymerase” or a “phage-type RNAP” refers to a polymerase that has homology to T7 RNAP and can incorporate rNTPs into RNA. Exemplary phage-type polymerases include RNAPs from the phages T7, T3, SP6, ΦI, ΦII, W31, H, Y, A1, 122, cro, C21, C22, and C23; gh-1, IV, ViIII or 11, or a mitochondrial RNAP, as well as derivative and mutant forms of these RNAPs. A “phage-type RNAP promoter” is a promoter from which transcription is initiated by a phage-type RNAP. In an aspect, the T7, SP6, or T3 RNAP initiates synthesis of a single-stranded nucleic acid molecule via the T7, SP6, or T3 promoter, respectively. In some aspects, the promoter sequence of the T7, SP6, and T3 promoters comprises SEQ ID NOs: 7, 8 and 9, respectively. In some instances, the promoter sequence of the T7, SP6, and T3 promoters is SEQ ID NOs: 7, 8 and 9, respectively. In some instances, the T7 promoter comprises SEQ ID NO: 7 with an additional G or GG at the 3′-end. In some instances, the T7 promoter comprises SEQ ID NO: 44. Other phage-type RNA polymerases initiate synthesis of a nucleic acid molecule via an appropriate promoter.
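The relationship between the promoter sequences can be checked with a short script. The sketch below is illustrative only: the function name `find_promoters` and the example template are hypothetical, the search is an exact string match on the strand as written (it does not scan the reverse complement), and the promoter sequences are those listed as SEQ ID NOs: 7, 8, 9 and 44 in the table.

```python
PROMOTERS = {
    "T7": "TAATACGACTCACTATAG",   # SEQ ID NO: 7
    "SP6": "ATTTAGGTGACACTATAG",  # SEQ ID NO: 8
    "T3": "AATTAACCCTCACTAAAG",   # SEQ ID NO: 9
}
T7_PROMOTER_GG = "TAATACGACTCACTATAGGG"  # SEQ ID NO: 44


def find_promoters(template: str) -> dict:
    """Return {promoter name: 0-based start index} for each exact match.

    Assumption: only the strand as written is searched.
    """
    template = template.upper()
    hits = {}
    for name, sequence in PROMOTERS.items():
        index = template.find(sequence)
        if index != -1:
            hits[name] = index
    return hits


# SEQ ID NO: 44 is SEQ ID NO: 7 with an additional GG at the 3'-end.
has_gg_extension = PROMOTERS["T7"] + "GG" == T7_PROMOTER_GG

# Hypothetical template: arbitrary flanking bases around SEQ ID NO: 44.
example_template = "GCGC" + T7_PROMOTER_GG + "AAGCTTGCATGC"
hits = find_promoters(example_template)
print(has_gg_extension, hits)
```

An exact match is the simplest possible test; it says nothing about whether a given polymerase would tolerate promoter variants.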
Suitable exemplary phage-type RNA polymerases include, but are not limited to, RNA polymerases from Escherichia phage T7 (T7 RNA polymerase, GenBank: ACY75835.1; SEQ ID NO: 1), Salmonella virus SP6 (SP6 RNA polymerase, GenBank: AAR90000.1; SEQ ID NO: 2), Enterobacteria phage T3 (T3 RNA polymerase, GenBank: CAC86264.1; SEQ ID NO: 3), Enterobacteria phage 13a (GenBank: ACF15888.1; SEQ ID NO: 25), Enterobacteria phage 285P (GenBank: ACV32460.1; SEQ ID NO: 26), Enterobacteria phage BA14 (GenBank: ACF15731.1; SEQ ID NO: 27), Enterobacteria phage EcoDS1 (GenBank: ACF15785.1; SEQ ID NO: 28), Yersinia phage Yepe2 (GenBank: ACF15684.1; SEQ ID NO: 29), Klebsiella phage K11 (GenBank: ACF15837.1; SEQ ID NO: 30), Salmonella phage phiSG-JL2 (GenBank: ACD75668.1; SEQ ID NO: 31), Yersinia phage Berlin (GenBank: CAJ70654.1; SEQ ID NO: 32), Salmonella phage Vi06 (GenBank: CBV65202.1; SEQ ID NO: 33), Pseudomonad phage gh-1 (GenBank: AAO73140.1; SEQ ID NO: 34), Enterobacteria phage K1F (GenBank: AAZ72968.1; SEQ ID NO: 35), Yersinia phage phiA1122 (GenBank: AAP20500.1; SEQ ID NO: 36), Yersinia phage phiYeO3-12 (GenBank: CAB63592.1; SEQ ID NO: 37), Kluyvera phage Kvp1 (GenBank: ACJ14548.1; SEQ ID NO: 38), Morganella phage MmP1 (GenBank: ACY74627.1; SEQ ID NO: 39), Vibrio phage N4 (GenBank: ACR16468.1; SEQ ID NO: 40), Vibriophage VP4 (GenBank: AAY46276.1; SEQ ID NO: 41), Enterobacteria phage K11 (GenBank: CAA37330.1; SEQ ID NO: 42), and Synechococcus virus Syn5 (GenBank: YP_001285424.1; SEQ ID NO: 43).
In some aspects, a phage-type RNAP has at least 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% sequence identity to SEQ ID NO: 1. For example, Table 1 of US9062292, which is incorporated in its entirety herein by reference, provides a listing of known RNA polymerases, including the homology of those RNA polymerases to T7 (SEQ ID NO: 1). SP6 and T3 are exemplary phage-type RNA polymerases.
Positions in various RNAPs that correspond to positions in T7 RNAP can be determined by alignment of amino acid sequences. Alignment of amino acid sequences can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, ALIGN, Clustal Omega or MultiAlin software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
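As a minimal illustrative sketch, percent sequence identity can be computed from two sequences that have already been aligned, for example by one of the programs named above. The function name `percent_identity`, the toy sequences, and the choice of denominator (all alignment columns) are assumptions made for illustration only; alignment programs may report identity over a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over all columns of a pairwise alignment.

    Assumption: both strings are rows of the same alignment, '-' marks a gap,
    and identity is counted over every alignment column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both rows carry the same residue.
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy (hypothetical) alignment, not taken from the Sequence Listing.
identity = percent_identity("MKV-LGD", "M-VALGD")
meets_80_percent = identity >= 80.0
```

In this toy case five of the seven columns are identical, so the pair would fall below an 80% identity threshold.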
Phage-type RNAPs have been widely studied, and a variety of domains have been described for these enzymes. Phage-type RNAPs comprise a catalytic domain, N-terminal domain, fingers subdomain, palm 1 subdomain, and palm 2 subdomain. The palm 1 subdomain and palm 2 subdomain may be referred to as the palm subdomains.
The fingers subdomain functions to bind the nucleoside triphosphates with the template base, while the palm subdomains function to coordinate the nucleoside triphosphates with the template.
The N-terminal domain comprises multiple subdomains. The N-terminal domain is involved in promoter recognition and DNA strand separation via the participation of an AT-rich recognition loop and an intercalating beta-hairpin loop. Three regions, the specificity loop (amino acids 739-770 of T7 RNAP as disclosed in SEQ ID NO: 1), the AT-rich recognition loop, and the intercalating beta-hairpin loop, can interact with the promoter.
The AT-rich recognition loop, which comprises residues 93-101 of SEQ ID NO: 1, indirectly binds through its inherent flexibility to the DNA sequence in the minor groove. The intercalating beta-hairpin loop plays an important role in transcription initiation: formation and stabilization of the transcription bubble. It facilitates strand separation of the promoter duplex so that the template strand can be accessible to the binding site of the T7 RNA polymerase.
In some instances, a phage-type RNAP comprises an N-terminal domain, a catalytic domain, a fingers subdomain, and palm subdomains.
In some aspects, a T7 RNAP comprises a N-terminal domain, a catalytic domain, a fingers subdomain, and palm subdomains. In some aspects, a T7 RNAP comprises SEQ ID NO: 1. In further aspects, the catalytic domain of a T7 RNAP comprises amino acids 325-883 of SEQ ID NO: 1. In some aspects, the N-terminal domain of a T7 RNAP comprises amino acids 1-324 of SEQ ID NO: 1. In some aspects, the fingers subdomain of a T7 RNAP comprises amino acids 566-784 of SEQ ID NO: 1. In some aspects, the palm 1 subdomain of a T7 RNAP comprises amino acids 325-411 of SEQ ID NO: 1. In some aspects, the palm 2 subdomain of a T7 RNAP comprises amino acids 785-883 of SEQ ID NO: 1.
In some aspects, a SP6 RNAP comprises a N-terminal domain, a catalytic domain, a fingers subdomain, and palm subdomains. In some aspects, a SP6 RNAP comprises SEQ ID NO: 2. In some aspects, the catalytic domain of a SP6 RNAP comprises amino acids 298-894 of SEQ ID NO: 2. In some aspects, the N-terminal domain of a SP6 RNAP comprises amino acids 1-297 of SEQ ID NO: 2. In some aspects, the fingers subdomain of a SP6 RNAP comprises amino acids 565-779 of SEQ ID NO: 2. In some aspects, the palm 1 subdomain of a SP6 RNAP comprises amino acids 298-401 of SEQ ID NO: 2. In some aspects, the palm 2 subdomain of a SP6 RNAP comprises amino acids 780-894 of SEQ ID NO: 2.
In some aspects, a T3 RNAP comprises a N-terminal domain, a catalytic domain, a fingers subdomain, and palm subdomains. In some aspects, a T3 RNAP comprises SEQ ID NO: 3. In some aspects, the catalytic domain of a T3 RNAP comprises amino acids 326-884 of SEQ ID NO: 3. In some aspects, the N-terminal domain of a T3 RNAP comprises amino acids 1-325 of SEQ ID NO: 3. In some aspects, the fingers subdomain of a T3 RNAP comprises amino acids 567-785 of SEQ ID NO: 3. In some aspects, the palm 1 subdomain of a T3 RNAP comprises amino acids 326-412 of SEQ ID NO: 3. In some aspects, the palm 2 subdomain of a T3 RNAP comprises amino acids 786-884 of SEQ ID NO: 3.
In some aspects, a catalytic domain of a T7 RNAP comprises a fingers subdomain and palm subdomains. In some aspects, a catalytic domain of a T7 RNAP comprises SEQ ID NO: 4. In some aspects, the fingers subdomain of a T7 RNAP catalytic domain comprises amino acids 262-460 of SEQ ID NO: 4. In some aspects, the palm 1 subdomain of a T7 RNAP catalytic domain comprises amino acids 1-87 of SEQ ID NO: 4. In some aspects, the palm 2 subdomain of a T7 RNAP catalytic domain comprises amino acids 461-559 of SEQ ID NO: 4.
In some aspects, a catalytic domain of a SP6 RNAP comprises a fingers subdomain and palm subdomains. In some aspects, a catalytic domain of a SP6 RNAP comprises SEQ ID NO: 5. In some aspects, the fingers subdomain of a SP6 RNAP catalytic domain comprises amino acids 268-488 of SEQ ID NO: 5. In some aspects, the palm 1 subdomain of a SP6 RNAP catalytic domain comprises amino acids 1-104 of SEQ ID NO: 5. In some aspects, the palm 2 subdomain of a SP6 RNAP catalytic domain comprises amino acids 483-597 of SEQ ID NO: 5.
In some aspects, a catalytic domain of a T3 RNAP comprises a fingers subdomain and palm subdomains. In some aspects, a catalytic domain of a T3 RNAP comprises SEQ ID NO: 6. In some aspects, the fingers subdomain of a T3 RNAP catalytic domain comprises amino acids 242-459 of SEQ ID NO: 6. In some aspects, the palm 1 subdomain of a T3 RNAP catalytic domain comprises amino acids 1-87 of SEQ ID NO: 6. In some aspects, the palm 2 subdomain of a T3 RNAP catalytic domain comprises amino acids 460-559 of SEQ ID NO: 6.
In some instances, a mutant polymerase described herein comprises one or more amino acid mutations that alter the function of the mutant polymerase compared to the corresponding wild-type (WT) polymerase. In some aspects, the one or more mutations are in the catalytic domain of the RNAP. Representative catalytic domains of an RNAP are SEQ ID NOs: 4-6 (from T7, SP6, and T3 RNAPs, respectively).
In some aspects, a mutant polymerase comprises a catalytic domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 4, and wherein the mutant polymerase comprises one or more mutations at position V459, G231, V365, F222, M372, T405, P406, Q462, and/or D463 relative to SEQ ID NO: 4.
In some aspects, a mutant polymerase comprises a catalytic domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 5, and wherein the mutant polymerase comprises one or more mutations at position V481, G257, T391, H247, I398, L430, P431, H484, and/or D485 relative to SEQ ID NO: 5.
In some aspects, a mutant polymerase comprises a catalytic domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 6, and wherein the mutant polymerase comprises one or more mutations at position V459, G231, V365, F222, M372, T405, P406, Q462, and/or D463 relative to SEQ ID NO: 6.
In some aspects, a mutant polymerase comprises a catalytic domain and an N-terminal domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with amino acid sequence corresponding to amino acids 1-324 of SEQ ID NO: 1, 1-297 of SEQ ID NO: 2, or 1-325 of SEQ ID NO: 3.
In some instances, the mutant polymerase comprises: (i) a catalytic domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 4, (ii) one or more mutations at position V459, G231, V365, F222, M372, T405, P406, Q462, and/or D463 relative to SEQ ID NO: 4, and (iii) an N-terminal domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with amino acid sequence corresponding to amino acids 1-324 of SEQ ID NO: 1.
In some instances, the mutant polymerase comprises: (i) a catalytic domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 5, (ii) one or more mutations at position V481, G257, T391, H247, I398, L430, P431, H484, and/or D485 relative to SEQ ID NO: 5, and (iii) an N-terminal domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with amino acid sequence corresponding to amino acids 1-297 of SEQ ID NO: 2.
In some instances, the mutant polymerase comprises: (i) a catalytic domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 6, (ii) one or more mutations at position V459, G231, V365, F222, M372, T405, P406, Q462, and/or D463 relative to SEQ ID NO: 6, and (iii) an N-terminal domain having at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with amino acid sequence corresponding to amino acids 1-325 of SEQ ID NO: 3.
In some aspects, the catalytic domain and the N-terminal domain of a mutant polymerase disclosed herein are covalently linked. In other aspects, the catalytic domain and the N-terminal domain of a mutant polymerase disclosed herein are non-covalently linked.
In some aspects, a mutant polymerase comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 1 and comprises one or more mutations at position V783, G555, V689, F546, M696, T729, P730, Q786, and/or D787 relative to SEQ ID NO: 1.
In some aspects, a mutant polymerase comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 2 and comprises one or more mutations at position V778, G554, T688, H544, I695, L727, P728, H781, and/or D782 relative to SEQ ID NO: 2.
In some aspects, a mutant polymerase comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 3 and comprises one or more mutations at position V784, G556, V690, F547, M697, T730, P731, Q787, and/or D788 relative to SEQ ID NO: 3.
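The positions recited relative to the full-length sequences (SEQ ID NOs: 1-3) and those recited relative to the catalytic domains (SEQ ID NOs: 4-6) differ by a constant offset, because the catalytic domains are described above as beginning at amino acids 325, 298, and 326 of SEQ ID NOs: 1, 2, and 3, respectively. The following is a minimal sketch of that conversion; the names `CATALYTIC_START`, `to_catalytic`, and `to_full_length` are hypothetical and introduced only for illustration.

```python
# First residue of the catalytic domain in full-length numbering,
# as stated in the description (T7: 325-883, SP6: 298-894, T3: 326-884).
CATALYTIC_START = {"T7": 325, "SP6": 298, "T3": 326}


def to_catalytic(polymerase: str, full_length_position: int) -> int:
    """Convert a full-length position to catalytic-domain numbering."""
    position = full_length_position - CATALYTIC_START[polymerase] + 1
    if position < 1:
        raise ValueError("position lies upstream of the catalytic domain")
    return position


def to_full_length(polymerase: str, catalytic_position: int) -> int:
    """Convert a catalytic-domain position to full-length numbering."""
    if catalytic_position < 1:
        raise ValueError("catalytic-domain positions start at 1")
    return catalytic_position + CATALYTIC_START[polymerase] - 1


# V783 of SEQ ID NO: 1 is V459 of SEQ ID NO: 4, and so on.
t7_v783 = to_catalytic("T7", 783)
sp6_v778 = to_catalytic("SP6", 778)
t3_d788 = to_catalytic("T3", 788)
```

For example, V783 of SEQ ID NO: 1 maps to V459 of SEQ ID NO: 4, V778 of SEQ ID NO: 2 maps to V481 of SEQ ID NO: 5, and D788 of SEQ ID NO: 3 maps to D463 of SEQ ID NO: 6.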
In some aspects, a mutant polymerase comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 1 and comprises one or more mutations at position V783, G555, V689, F546, M696, T729, P730, Q786, and/or D787 relative to SEQ ID NO: 1, wherein the mutant polymerase can bind the T7 promoter. In some aspects, a mutant polymerase further comprises (a) 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mutations in the polymerase catalytic domain; (b) 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mutations in the fingers subdomain; and/or (c) 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mutations in the palm subdomains.
In some aspects, a mutant polymerase comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 2 and comprises one or more mutations at position V778, G554, T688, H544, I695, L727, P728, H781, and/or D782 relative to SEQ ID NO: 2, wherein the mutant polymerase can bind the SP6 promoter. In some aspects, a mutant polymerase further comprises (a) 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mutations in the polymerase catalytic domain; (b) 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mutations in the fingers subdomain; and/or (c) 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mutations in the palm subdomains.
In some aspects, a mutant polymerase comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 3 and comprises one or more mutations at position V784, G556, V690, F547, M697, T730, P731, Q787, and/or D788 relative to SEQ ID NO: 3, wherein the mutant polymerase can bind the T3 promoter. In some aspects, a mutant polymerase further comprises (a) 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mutations in the polymerase catalytic domain; (b) 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mutations in the fingers subdomain; and/or (c) 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 mutations in the palm subdomains.
In some aspects, a mutant polymerase has at least 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% of the polymerase activity of the corresponding wild type polymerase when unmodified rNTPs are used for synthesizing RNA oligonucleotides from a DNA template. In some aspects, polymerase activity is measured such that one unit of the enzyme incorporates 1 nmol of AMP into a polynucleotide fraction in 60 minutes at 37° C.
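As an illustrative sketch of the arithmetic implied by this unit definition, activity in units can be derived from the amount of AMP incorporated in an assay at 37° C. and compared between a mutant and the corresponding wild type polymerase. The function names and all numerical inputs below are hypothetical, and the calculation assumes that incorporation is linear over the assay time.

```python
def units_of_activity(nmol_amp_incorporated: float, minutes: float) -> float:
    """Units of activity, where 1 unit incorporates 1 nmol of AMP in 60 minutes.

    Assumption: incorporation is linear with time over the assay period.
    """
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return nmol_amp_incorporated * 60.0 / minutes


def relative_activity_percent(mutant_units: float, wild_type_units: float) -> float:
    """Mutant activity expressed as a percentage of wild type activity."""
    if wild_type_units <= 0:
        raise ValueError("wild type activity must be positive")
    return 100.0 * mutant_units / wild_type_units


# Hypothetical assay values, not experimental data from this disclosure.
wild_type = units_of_activity(nmol_amp_incorporated=5.0, minutes=30.0)
mutant = units_of_activity(nmol_amp_incorporated=4.0, minutes=30.0)
relative = relative_activity_percent(mutant, wild_type)
```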
In some aspects, a mutant polymerase has at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 1 and comprises one or more of the following substitutions relative to SEQ ID NO: 1: a V783M, V783L, V783I, or V783C substitution; a G555L, G555M, G555I, G555V, or G555Y substitution; a V689Q, V689N, V689D, V689E, V689R, V689S, or V689W substitution; a F546E, F546M, or F546I substitution; a M696G or M696H substitution; a T729H, T729L, or T729R substitution; a P730Y substitution; a Q786M, Q786L, Q786N, or Q786W substitution; and/or a D787I substitution. In some aspects, the mutant polymerase comprises at least one substitution selected from V783M, G555L, and V689Q. In some aspects, the mutant polymerase comprises two or more substitutions selected from V783M, G555L, and V689Q. In some aspects, the mutant polymerase comprises V783M, G555L, and V689Q substitutions.
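The substitution notation used above (wild type residue, position, replacement residue, for example V783M) can be handled programmatically. The following is a minimal sketch under stated assumptions: the helper names `parse_substitution` and `apply_substitutions` are hypothetical, and the five-residue sequence is a toy stand-in rather than any sequence from the Sequence Listing.

```python
import re
from typing import List, Tuple

# One-letter wild type residue, 1-based position, one-letter replacement.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Tuple[str, int, str]:
    """Split a label such as 'V783M' into ('V', 783, 'M')."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_substitutions(sequence: str, labels: List[str]) -> str:
    """Return a copy of the sequence with each substitution applied.

    Raises ValueError if a label's wild type residue does not match the
    residue currently at that position.
    """
    residues = list(sequence)
    for label in labels:
        wild_type, position, replacement = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{label}: found {residues[position - 1]} instead")
        residues[position - 1] = replacement
    return "".join(residues)


# Toy (hypothetical) sequence; real use would start from, e.g., SEQ ID NO: 1.
toy_mutant = apply_substitutions("MKVLG", ["V3M", "G5L"])
```

Checking the wild type residue before substituting guards against numbering errors, for example when a label given in full-length numbering is applied to a catalytic-domain sequence.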
In some aspects, a mutant polymerase comprises one or more mutations at a position corresponding to V783, G555, and/or V689 of SEQ ID NO: 1 based on a sequence alignment. In some aspects, a mutant polymerase has at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with any of the sequences of SEQ ID NOs: 1-6 and 25-43, and comprises one or more mutations at a position corresponding to V783, G555, and/or V689 of SEQ ID NO: 1. Residues “correspond” to each other where they occur at equivalent positions in aligned amino acid sequences, such as phage-type RNA polymerase sequences and/or a domain thereof, such as, e.g., a catalytic domain. Corresponding positions can be identified as positions that align with one another. Related or variant polypeptides are aligned by any method in the art. Such methods typically maximize matches, and include manual alignment and the use of any of the numerous alignment programs available (for example, BLASTP) and others known in the art. By aligning the sequences of polypeptides, one of skill in the art can identify corresponding residues, using conserved and identical amino acid residues as guides. In some embodiments, an amino acid of a polypeptide is considered to correspond to an amino acid in a disclosed sequence when the amino acid of the polypeptide is aligned with the amino acid in the disclosed sequence upon alignment of the polypeptide with the disclosed sequence to maximize identity and homology (e.g., where conserved amino acids are aligned) using a standard alignment algorithm, such as the BLASTP algorithm with default scoring parameters (such as, for example, BLOSUM62 Matrix, Gap existence penalty 11, Gap extension penalty 1, and with default general parameters). As a nonlimiting example, with reference to the multiple sequence alignment shown in FIGS. 11-1 to 11-5, amino acid residue 783 in SEQ ID NO: 1 corresponds to positions 783, 806, 784, and 693 in SEQ ID NOs: 25, 30, 31, and 43, respectively (marked in grey in the Figures).
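The identification of corresponding positions from an alignment can be sketched as follows. This is an illustrative example only: it assumes that the two rows of a pairwise (or extracted multiple) alignment are already available as gapped strings, and the function name `map_positions` and the toy sequences are hypothetical.

```python
from typing import Dict, Optional


def map_positions(aligned_ref: str, aligned_query: str) -> Dict[int, Optional[int]]:
    """Map 1-based reference positions to 1-based query positions.

    Both inputs are rows of the same alignment with '-' marking gaps.
    A reference residue aligned to a gap maps to None.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    mapping: Dict[int, Optional[int]] = {}
    ref_position = 0
    query_position = 0
    for ref_residue, query_residue in zip(aligned_ref, aligned_query):
        if query_residue != "-":
            query_position += 1
        if ref_residue != "-":
            ref_position += 1
            mapping[ref_position] = query_position if query_residue != "-" else None
    return mapping


# Toy (hypothetical) alignment rows, not taken from FIGS. 11-1 to 11-5.
correspondence = map_positions("MKV-LGD", "M-VALGD")
```

Here reference residue 3 corresponds to query residue 2, while reference residue 2 aligns with a gap and therefore has no corresponding query position.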
In some aspects, amino acid substitutions identical or similar to those described above can be introduced to a phage-type RNA polymerase or a subsequence thereof. Alternative amino acid substitutions can be made using any of the techniques and guidelines for conservative and non-conservative amino acids as set forth, for example, by a standard Dayhoff frequency exchange matrix or BLOSUM matrix. Six general classes of amino acid side chains have been categorized and include: Class I (Cys); Class II (Ser, Thr, Pro, Ala, Gly); Class III (Asn, Asp, Gln, Glu); Class IV (His, Arg, Lys); Class V (Ile, Leu, Val, Met); and Class VI (Phe, Tyr, Trp). For example, substitution of an Asp for another class III residue such as Asn, Gln, or Glu, is a conservative substitution. As used herein, “non-conservative substitution” refers to the substitution of an amino acid in one class with an amino acid from another class; for example, substitution of an Ala, a class II residue, with a class III residue such as Asp, Asn, Glu, or Gln. Appropriate amino acid alterations allowed in relevant positions may be confirmed by testing the resulting modified RNA polymerases for activity in the in vitro assays known in the art or as described in the Examples below.
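The six side-chain classes listed above lend themselves to a simple lookup. The sketch below, given for illustration only, classifies a substitution as conservative (same class) or non-conservative (different classes) using one-letter amino acid codes; the names `SIDE_CHAIN_CLASSES` and `is_conservative` are hypothetical.

```python
# Six side-chain classes as listed in the description, in one-letter codes.
SIDE_CHAIN_CLASSES = {
    "I": "C",        # Cys
    "II": "STPAG",   # Ser, Thr, Pro, Ala, Gly
    "III": "NDQE",   # Asn, Asp, Gln, Glu
    "IV": "HRK",     # His, Arg, Lys
    "V": "ILVM",     # Ile, Leu, Val, Met
    "VI": "FYW",     # Phe, Tyr, Trp
}

# Reverse lookup: residue -> class label.
CLASS_OF = {
    residue: label
    for label, residues in SIDE_CHAIN_CLASSES.items()
    for residue in residues
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same side-chain class."""
    return CLASS_OF[original] == CLASS_OF[replacement]


asp_to_asn = is_conservative("D", "N")  # both class III
ala_to_asp = is_conservative("A", "D")  # class II to class III
val_to_met = is_conservative("V", "M")  # both class V
```

Under this scheme, a Val to Met substitution falls within class V, whereas a Gly to Leu substitution crosses from class II to class V.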
The mutant polymerases described herein can incorporate modified nucleotides, dNTPs, and ddNTPs to synthesize single-stranded nucleic acid. These mutant polymerases have advantages in a variety of methods involving nucleic acid synthesis.
In some aspects, a method for synthesizing a single-stranded nucleic acid comprises the steps of preparing a synthesis reaction mixture comprising a mutant RNA polymerase having enhanced ability to incorporate one or more modified nucleoside triphosphates, modified rNTPs, dNTPs, and/or ddNTPs compared with the wild type RNA polymerase, at least one nucleic acid template, and a mixture of nucleoside triphosphates; and performing a synthesis reaction under conditions that result in the production of one or more single-stranded nucleic acid. In some instances, mutant RNA polymerase comprising an amino acid sequence having at least 80%, 85%, 90%, 95%, 98%, or 99% sequence identity with a sequence selected from SEQ ID NOs: 1 to 6, and 25 to 43, and comprising at least one substitution or substitution set in said amino acid sequence is used. In further aspects, mutant RNA polymerase comprises at least one amino acid substitution at a position corresponding to the position V783, G555, and/or V689 of SEQ ID NO: 1. In some aspects, a mutant polymerase that comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 1 and comprises one or more mutations at position V783, G555, V689, F546, M696, T729, P730, Q786, and/or D787 relative to SEQ ID NO: 1 is used. In further aspects, a mutant polymerase that comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 1 and comprises one or more mutations at position V783, G555, and/or V689 relative to SEQ ID NO: 1 is used.
In other instances, in a method for synthesizing single-stranded nucleic acid, a mutant polymerase that comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 2 and comprises one or more mutations at position V778, G554, T688, H544, I695, L727, P728, H781, and/or D782 relative to SEQ ID NO: 2 is used. In other instances, a mutant polymerase that comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 3 and comprises one or more mutations at position V784, G556, V690, F547, M697, T730, P731, Q787, and/or D788 relative to SEQ ID NO: 3 is used.
In some aspects, the at least one nucleic acid template used in the method for synthesizing a single-stranded nucleic acid described herein comprises one or more promoter sequences recognized by the mutant polymerase. In some aspects, when, for example, a mutant polymerase that comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 1 and comprises one or more mutations at position V783, G555 and/or V689 relative to SEQ ID NO: 1 is used, the at least one nucleic acid template comprises a T7 promoter operably linked to a target nucleotide sequence. In some aspects, the target nucleotide sequence is the sequence to be transcribed.
In some aspects, when, for example, a mutant polymerase that comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 2 and comprises one or more mutations at position V778, G554, and/or T688 relative to SEQ ID NO: 2 is used, the at least one nucleic acid template comprises a SP6 promoter operably linked to a target nucleotide sequence.
In some aspects, when, for example, a mutant polymerase that comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 3 and comprises one or more mutations at position V784, G556, and/or V690 relative to SEQ ID NO: 3 is used, the at least one nucleic acid template comprises a T3 promoter operably linked to a target nucleotide sequence.
In an aspect, the synthesis reaction that takes place in a method for synthesizing a single-stranded nucleic acid described herein advantageously does not require primers. In a further aspect, the synthesis reaction is performed without changes in reaction temperature, which makes the method simpler.
In some aspects, the mixture of nucleoside triphosphates used in the currently described method for synthesizing single-stranded nucleic acid comprises one or more nucleoside triphosphates selected from modified nucleoside triphosphates, modified rNTPs, dNTPs, ddNTPs, and modified NTPs.
In an aspect, the mixture of nucleoside triphosphates comprises one or more dNTPs. In some aspects, the mixture of nucleoside triphosphates may consist essentially of one or more dNTPs. In some aspects, the mixture of nucleoside triphosphates consists of 1, 2, 3, or 4 dNTPs. In other aspects, one or more dNTPs are modified. In some aspects, one or more dNTPs are 2′-F modified. As a result, the nucleic acid synthesized according to the currently described method will comprise one or more deoxyribonucleotides.
In some aspects, in a method for synthesizing a single stranded nucleic acid, a mutant polymerase that comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 1 and comprises one or more mutations at position V783, G555 and/or V689 relative to SEQ ID NO: 1 is used, wherein at least one nucleic acid template comprises a T7 promoter operably linked to a target nucleotide sequence, and wherein the mixture of nucleoside triphosphates comprises one or more dNTPs. In some aspects, one or more dNTPs are 2′-F modified.
In some aspects, the mixture of nucleoside triphosphates used in the currently described method for synthesizing single-stranded nucleic acid comprises one or more 2′-modified rNTPs. In some aspects, the one or more 2′-modified rNTPs are selected from 2′-O-methyl, 2′-NH2, 2′-F, and 2′-methoxy ethyl rNTPs. In some aspects, the mixture of nucleoside triphosphates consists essentially of one or more 2′-modified rNTPs. As a result, the nucleic acid synthesized according to the currently described method will comprise one or more 2′-modified ribonucleotides.
In some aspects, the mixture of nucleoside triphosphates used in the currently described method for synthesizing single-stranded nucleic acid comprises (i) one or more dNTPs and one or more rNTPs; (ii) three different dNTPs and one rNTP; (iii) two different dNTPs and two different rNTPs; or (iv) one dNTP and three different rNTPs. As a result, when synthesized by a mutant polymerase according to the present disclosure using the method for synthesizing single-stranded nucleic acid, the synthesized nucleic acid may comprise deoxyribonucleotides and ribonucleotides. In certain aspects, the one or more rNTPs are modified; in further aspects, the one or more rNTPs may be 2′-modified.
In some aspects, the mixture of nucleoside triphosphates used in the currently described method for synthesizing single-stranded nucleic acid comprises dTTP, dCTP, ATP, and GTP; dTTP, CTP, ATP, and dGTP; dTTP, dCTP, dATP, and GTP; dTTP, dCTP, dATP, and 2′-F-dGTP; dUTP, dCTP, ATP, and GTP; or dUTP, dCTP, dATP, and GTP. In some aspects, in a method for synthesizing a single stranded nucleic acid, a mutant polymerase that comprises at least 80%, 85%, 90%, 95%, 98% or 99% sequence identity with SEQ ID NO: 1 and comprises one or more mutations at position V783, G555 and/or V689 relative to SEQ ID NO: 1 is used, wherein the mixture of nucleoside triphosphates comprises one of the mixtures: dTTP, dCTP, ATP, and GTP; dTTP, CTP, ATP, and dGTP; dTTP, dCTP, dATP, and GTP; dTTP, dCTP, dATP, and 2′-F-dGTP; dUTP, dCTP, ATP, and GTP; or dUTP, dCTP, dATP, and GTP.
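For illustration, the composition of the nucleoside triphosphate mixtures listed above can be tallied by sugar type. The sketch below classifies each component by its name alone; the naming convention it relies on (a leading "d" for deoxy, "dd" for dideoxy, and any modification prefix separated by hyphens, as in 2′-F-dGTP) and the helper names are assumptions made for this example.

```python
from typing import Dict, List


def classify_ntp(name: str) -> str:
    """Classify a nucleoside triphosphate name as 'rNTP', 'dNTP', or 'ddNTP'.

    Assumption: any modification prefix is hyphen-separated (e.g. "2'-F-dGTP"),
    so the last hyphen-delimited field is the core nucleotide name.
    """
    core = name.split("-")[-1]
    if core.startswith("dd"):
        return "ddNTP"
    if core.startswith("d"):
        return "dNTP"
    return "rNTP"


def count_classes(mixture: List[str]) -> Dict[str, int]:
    """Count how many components of a mixture fall into each class."""
    counts = {"rNTP": 0, "dNTP": 0, "ddNTP": 0}
    for name in mixture:
        counts[classify_ntp(name)] += 1
    return counts


# The six exemplary mixtures recited above (ASCII apostrophe used for 2').
MIXTURES = [
    ["dTTP", "dCTP", "ATP", "GTP"],
    ["dTTP", "CTP", "ATP", "dGTP"],
    ["dTTP", "dCTP", "dATP", "GTP"],
    ["dTTP", "dCTP", "dATP", "2'-F-dGTP"],
    ["dUTP", "dCTP", "ATP", "GTP"],
    ["dUTP", "dCTP", "dATP", "GTP"],
]
dntp_counts = [count_classes(mixture)["dNTP"] for mixture in MIXTURES]
```

The tally shows that the exemplary mixtures contain two, three, or four dNTP-type components, with the remainder being rNTPs.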
In some aspects, in a method for synthesizing a single-stranded nucleic acid, the mixture of nucleoside triphosphates comprising dNTPs and/or rNTPs further comprises one or more ddNTPs. The one or more ddNTPs may be modified. For example, the modified ddNTP may further comprise a reporter moiety or an oligonucleotide attached to a linker attached to the ddNTP. The reporter moiety may include any suitable chemical or substance that may be detected as a signal or contrast using imaging techniques, or it can be any moiety capable of binding to a substrate, for example, a magnetic bead, a chromatography column bound with, for example, avidin, streptavidin, antigen, antibody, and the like. In some aspects, such moiety can be selected from biotin, iminobiotin, avidin, and streptavidin.
In some aspects, the mixture of nucleoside triphosphates comprises one or more oligonucleotide-tethered nucleotide. Examples and aspects of oligonucleotide-tethered nucleotides are described in part “A” below.
In some aspects, the synthesis reaction mixture used in the currently disclosed method of synthesizing single-stranded nucleic acid further comprises a cap or cap analog. As an example, “cap” may refer to the guanine nucleoside that is joined via its 5′ carbon to a triphosphate group that is, in turn, joined to the 5′ carbon of the most 5′ nucleotide of a transcript. In some examples, the nitrogen at the 7 position of guanine in the cap is methylated. As an example, “cap analog” may refer to a di-, tri-, or tetranucleotide containing a 5′-5′ di-, tri-, or tetra-phosphate linkage between the first and the second nucleotide. One end of the cap analog terminates in either a guanosine or substituted guanosine residue; it is this end from which RNA polymerase will initiate transcription by extending from the 3′ hydroxyl. The second nucleotide of the cap analog is a guanosine that mimics the eukaryotic cap structure, and may have 7-methyl-, 7-benzyl-, or 7-ethyl-substitutions and/or 7-aminomethyl or 7-aminoethyl substitutions. As another example of “cap analog”, the terms “ARCA” and “anti-reverse cap analog” refer to chemically modified forms of cap analogs, designed to maximize the efficiency of in vitro translation by ensuring that the cap analog is properly incorporated into the transcript in the correct orientation.
In some aspects, the one or more nucleic acid synthesized using the currently described method for synthesizing single-stranded nucleic acid comprise deoxyribonucleotides and/or ribonucleotides. In some aspects, the one or more nucleic acid comprise deoxyribonucleotides. As such, the one or more nucleic acid may comprise DNA. In some aspects, the one or more nucleic acid comprise deoxyribonucleotides and ribonucleotides. As such, the one or more nucleic acid may comprise RNA/DNA chimeric sequence.
In some aspects, the one or more nucleic acid comprise ribonucleotides. As such, the one or more nucleic acid may comprise RNA. In some aspects, the one or more nucleic acid comprising ribonucleotides are an RNA aptamer, a ribozyme, an siRNA, an miRNA, or an antisense RNA. In further aspects, the nucleic acid comprises canonical and non-canonical (e.g. modified) ribonucleotides.
In some aspects, the one or more nucleic acid comprise deoxyribonucleotides and ribonucleotides. As such, the one or more nucleic acid may comprise a DNA/RNA chimera.
In some aspects, the one or more nucleic acid synthesized by the method of synthesizing a single-stranded nucleic acid described herein comprise from about 8 to about 2000 nucleotides, including intermediate ranges, such as from about 100 to about 1500 nucleotides, from about 800 to about 1500 nucleotides, from about 800 to about 1000 nucleotides, from about 10 to about 50 nucleotides, from about 15 to about 35 nucleotides, from about 18 to about 75 nucleotides and from about 25 to about 150 nucleotides.
The present mutant polymerases have an ability to incorporate modified dNTPs bearing bulky groups attached to their nucleobases, such as oligonucleotide-tethered nucleotides, and these oligonucleotide-tethered nucleotides can be incorporated into a growing nucleic acid strand by a mutant polymerase during nucleic acid synthesis.
By using appropriately designed oligonucleotide-tethered ddNTPs (OTDDNs, as shown in FIG. 9A) it is possible to generate DNA ends that are compatible with sequencing on various platforms, including but not limited to the Illumina platform.
While not being bound by theory, it is believed that efficient incorporation of modified nucleotides during nucleic acid synthesis is highly dependent on the size of the attached label. More importantly, the length of the linker between the nucleotide heterocyclic base and the label has a significant impact on incorporation. The linker should be long enough to reduce steric hindrance from the label and changes in the steric structure of the nucleotide. At the same time, it has to be short enough to avoid back-folding. Moreover, the terminal functional groups of the linker must be tolerated by the polymerase enzymes. A properly designed linker will allow incorporation of nucleotides bearing large labels.
When the oligonucleotide-tethered nucleotide has 3′-H instead of the 3′-hydroxyl group (dideoxy-modified nucleotide), incorporation of such an oligonucleotide-tethered nucleotide would terminate the DNA synthesis. Using an oligonucleotide-tethered dideoxynucleotide and target-specific or random primers with a universal sequence, a set of randomly terminated fragments is generated, which can then be subjected to PCR conditions for platform-specific full-length sequencing adaptor introduction. In some aspects, this method can also be used to overcome the need for nucleic acid fragmentation. The oligonucleotide-tethered nucleotides may optionally be biotin-modified to facilitate enrichment.
The below disclosure describes tethering of an oligonucleotide to any nucleotide and its later incorporation into a nucleic acid sequence while performing strand synthesis with a nucleic acid polymerase. This method also makes it possible to attach an oligonucleotide to the final nucleotide of a nucleic acid of any sequence composition.
In some aspects, the oligonucleotide-tethered nucleotides used herein generally have a structure according to formula (A), or a salt thereof:
wherein NB is a nucleobase; Oligo is an oligonucleotide of 3 to 100 nucleotides; each of X and Q is independently chosen from H, OH, N3, halo, alkyl, alkoxy, alkenyl, alkynyl, acyl, cyano, amino, ester, and amido; each of Z and Y is independently chosen from a bond, amino, amido, alkylene, alkenylene, alkynylene, thioether, sulfonyl, sulfonamido, ether, ketone, carbonyl, anhydride, ester, imido, urea, urethane, and combinations thereof; and CXN is chosen from alkylene, alkenylene, alkynylene, ketone, carbonate, ester, ether, anhydride, amido, amino, aminoalkyl, imino, imido, diazo, carbamate ester, phosphodiester, sulfide, disulfide, sulfonyl, sulfonamido, and a heterocyclic group containing from one to four N, O, S atom(s) or a combination thereof where the heterocyclic group is optionally substituted at carbon, nitrogen or sulfur atom(s).
In further aspects, the oligonucleotide-tethered nucleotides used herein may have a structure according to formula (I), or a salt thereof:
wherein X is H, OH or N3, NB represents a nucleobase, Z and Y are linkers, Oligo represents an oligonucleotide of 3 to 100 nucleotides in length, and Click represents the reaction product of a Click reaction, which covalently binds the Z and Y linkers. In some aspects, the nucleobase is chosen from adenine, 7-deaza-adenine, cytosine, guanine, 7-deazaguanine, thymine, uracil and inosine. In some aspects, Z and Y each independently comprise at least one linking moiety chosen from amino, amido, alkyl, alkenyl, alkynyl, thioether, sulfonyl, sulfonamido, ether, ketone, carbonyl, anhydride, ester, imide, urea, urethane, or any combination thereof.
Alternatively, the oligonucleotide-tethered nucleotide can be acyclic (I′).
The Click reaction product includes the products of reactions such as, but not limited to, copper catalyzed azide-alkyne cycloaddition (CuAAC); strain-promoted azide-alkyne cycloaddition (SPAAC) also known as copper-free click chemistry; strain-promoted alkyne-nitrone cycloaddition (SPANC); alkyne hydrothiolation; and alkene hydrothiolation.
In some aspects, the Click reaction is a (3+2) cycloaddition reaction of an azide and an alkyne, resulting in a 1,2,3-triazole. The reaction provides a triazole product, thereby providing an oligonucleotide-tethered nucleotide of formula (II), or a salt thereof:
wherein, X, NB, Z, Y and Oligo are as defined above. In formula (II), one of Z and Y is covalently bound to the 1 position of the triazole, while the other of Z and Y is covalently bound to the 4 or 5 position of the triazole. In one aspect, X is OH, and in another aspect, X is H, and in yet another aspect X is N3.
In some aspects, the linkers Z and/or Y include a carbon-based chain, for example an alkylene chain having 1 to 12 carbon atoms that may be linear or branched. In some aspects, the alkylene is a straight or branched C1-C6 alkylene. Linkers Z and/or Y may also include a straight or branched alkenylene having 2 to 12 carbons. Alternatively, the alkenylene is a straight or branched C2-C6 alkenylene. In some aspects, linkers Z and/or Y include a straight or branched alkynylene chain of 2 to 12 carbons. In some aspects, the alkynylene is a straight or branched C2-C6 alkynylene.
In some aspects, Z and/or Y includes a polyalkylene glycol having from 2 to 20 alkylene glycol units, while in other aspects, the polyalkylene glycol has 2 to 8 alkylene glycol units. In some aspects, the polyalkylene glycol has 2, 4, or 6 to 8 glycol units. Suitable alkylene glycol units include ethylene glycol, 1,2-propane-diol, 1,2-butylene glycol, and the like.
The oligonucleotide-tethered nucleotide may more particularly have the structure of formula (III), or a salt thereof:
wherein L1 and L2 are each linkers independently comprising an alkylene, an alkynylene, a polyalkylene glycol, or any combination thereof.
In some aspects, the oligonucleotide-tethered nucleotide may have the structure of formula (III), or a salt thereof, wherein L1 is a linker comprising an alkylene, a polyalkylene glycol, or a combination thereof, and L2 is a linker comprising an alkynylene having from 2 to 12 carbons. More particularly, L2 is hexynyl. The polyalkylene glycol may be a polyethylene glycol having from 2 to 6 ethylene glycol units. In another aspect, L1 comprises an alkylene having 1 to 12 carbon atoms. More particularly, the alkylene is methylene, ethylene, n-propylene, isopropylene, 1-butylene, cis-2-butylene, trans-2-butylene, isobutylene, 1-pentylene, cis-2-pentylene, trans-2-pentylene, isopentylene, or hexylene.
Alternatively, when strain-promoted azide-alkyne cycloaddition (SPAAC) also known as copper-free click chemistry, is used to generate oligonucleotide-tethered nucleotides, the resulting oligonucleotide-tethered nucleotides described herein generally have a structure according to formula (IV), or a salt thereof:
wherein X is H, OH or N3, NB represents a nucleobase, Z and Y are linkers, and Oligo represents an oligonucleotide of 3 to 100 nucleotides in length. In some aspects, the nucleobase is chosen from adenine, 7-deaza-adenine, cytosine, guanine, 7-deazaguanine, thymine, uracil and inosine. In some aspects, Z and Y each independently comprise at least one linking moiety chosen from amino, amido, alkyl, alkenyl, alkynyl, thioether, sulfonyl, sulfonamido, ether, ketone, carbonyl, anhydride, ester, imide, urea, urethane, or any combination thereof.
Alternatively, in some aspects azide modification can be introduced at the 3′ position of the nucleotide and therefore oligonucleotide can be covalently tethered to the 3′ position of the nucleotide (V and VI).
wherein NB represents a nucleobase, Y is a linker, and Oligo represents an oligonucleotide of 3 to 100 nucleotides in length. In some aspects, the nucleobase is chosen from adenine, 7-deaza-adenine, cytosine, guanine, 7-deazaguanine, thymine, uracil and inosine. In some aspects, Z and Y each independently comprise at least one linking moiety chosen from —C(O)NH—, —C(O)O—, —NH—, —S—, —O—, alkyl, alkenyl, and alkynyl, or any combination thereof.
The oligonucleotide-tethered nucleotide of the present disclosure comprises, in some aspects, a pyrimidine nucleobase. In these aspects, the pyrimidine nucleobase is bound to the oligonucleotide at the 5 position of the pyrimidine (see formula A, below). Alternatively, when the oligonucleotide-tethered nucleotide comprises a purine base, the purine nucleobase is bound to the oligonucleotide at the 7 position of the nucleobase. In other aspects, the oligonucleotide is bound to the nucleotide at the 3′ position of the nucleotide (see formula B).
Salts of the oligonucleotide-tethered nucleotides of the present disclosure include quaternary ammonium salts, sodium salts, potassium salts and the like.
The oligonucleotide used herein may comprise a barcode sequence, an adapter sequence, a unique molecular identifier, or any combination thereof. The oligonucleotide used is not limited to any specific sequence. Further, in some aspects, the oligonucleotide is tethered to the nucleotide at its 5′ end. In some aspects, an alkyne modification is added to the oligonucleotide nucleobase via a spacer of 8 carbon atoms and is referred to as an “Ald” modification, or alternatively the alkyne group is attached to the phosphate of the 5′ terminus of the oligonucleotide via a hexynyl linker and is referred to as an “Alxyl” modification. In some examples, the 3′ end of the oligonucleotide has a biotin, phosphate, amine or phosphorothioate modification.
Various strategies may be used to prepare the oligonucleotide-tethered nucleotides and the claimed compositions and methods of using them are not limited by any description of the methods of making these useful compounds.
The term “click chemistry” is well understood in the art and generally refers to fast reactions whose products are easily purified and that are regiospecific. Click chemistry includes without limitation copper catalyzed azide-alkyne cycloaddition (CuAAC); strain-promoted azide-alkyne cycloaddition (SPAAC), also known as copper-free click chemistry; strain-promoted alkyne-nitrone cycloaddition (SPANC); alkyne hydrothiolation; and alkene hydrothiolation.
As used herein, and unless otherwise indicated, the terms “contacting,” “adding,” “reacting,” “treating,” or the like means contacting one reactant, reagent, solvent, catalyst, reactive group or the like with another reactant, reagent, solvent, catalyst, reactive group or the like. Reactants, reagents, solvents, catalysts, reactive groups or the like can be added individually, simultaneously or separately and can be added in any order that achieves a desired result. They can be added in the presence or absence of a heating or cooling apparatus and can optionally be added under an inert atmosphere.
In some aspects, a method for preparing an oligonucleotide-tethered nucleotide according to the present disclosure comprises providing a nucleotide covalently bound to a first functional group capable of undergoing a click reaction with a second functional group; providing an oligonucleotide covalently bound to the second functional group capable of undergoing a click reaction, wherein the first and second functional groups are respectively chosen from the following pairs: alkynyl and azido; azido and alkynyl; thiol and alkynyl; alkynyl and thiol; thiol and alkenyl; alkenyl and thiol; azido and cyclooctanyl; cyclooctanyl and azido; nitrone and cyclooctanyl; cyclooctanyl and nitrone; and contacting the nucleotide with the oligonucleotide in the presence of a copper catalyst and a copper(I) ligand to form a click reaction product.
In a particular aspect, the method comprises a click reaction of an azide and an alkyne to form a 1,2,3-triazole. Azides and terminal or internal alkynes can undergo a 1,3-dipolar cycloaddition (Huisgen cycloaddition) reaction to give a 1,2,3-triazole. However, this reaction requires long reaction times and elevated temperatures. Alternatively, azides and terminal alkynes can undergo Copper(I)-catalyzed Azide-Alkyne Cycloaddition (CuAAC) at room temperature. Such copper(I)-catalyzed azide-alkyne cycloadditions, also known as “click chemistry,” are a variant of the Huisgen 1,3-dipolar cycloaddition, wherein organic azides and terminal alkynes react to give 1,4-regioisomers of 1,2,3-triazoles. Examples of “click” chemistry reactions are described by Sharpless et al. (U.S. Pat. Application Publication No. 20050222427, published Oct. 6, 2005, PCT/US03/17311; Lewis W. G. et al., Angewandte Chemie-Int’l Ed. 41 (6): 1053; method reviewed in Kolb, H. C., et al., Angew. Chem. Int. Ed. 2001, 40:2004-2021), who developed reagents that react with each other in high yield and with few side reactions in a heteroatom linkage (as opposed to carbon-carbon bonds) in order to create libraries of chemical compounds.
The copper used as a catalyst for the “click chemistry” reaction used in the methods described herein to conjugate a label (reporter group, solid support or carrier molecule) to a nucleic acid is in the Cu(I) reduction state. The sources of copper(I) used in such copper(I)-catalyzed azide-alkyne cycloadditions can be any cuprous salt including, but not limited to, cuprous halides such as cuprous bromide or cuprous iodide. However, this regioselective cycloaddition can also be conducted in the presence of a metal catalyst and a reducing agent.
In certain aspects, copper can be provided in the Cu(II) reduction state (for example, as a salt, such as but not limited to Cu(NO3)2, Cu(OAc)2, or CuSO4), in the presence of a reducing agent wherein Cu(I) is formed in situ by the reduction of Cu(II). Such reducing agents include, but are not limited to, ascorbate, Tris(2-Carboxyethyl) Phosphine (TCEP), 2,4,6-trichlorophenol (TCP), NADH, NADPH, thiosulfate, metallic copper, quinone, hydroquinone, Vitamin K, glutathione, cysteine, 2-mercaptoethanol, dithiothreitol, Fe(II), Co(II), or an applied electric potential. In other aspects, the reducing agents include metals chosen from Al, Be, Co, Cr, Fe, Mg, Mn, Ni, Zn, Au, Ag, Hg, Cd, Zr, Ru, Pt, Pd, Rh, and W. In particular aspects, the reducing agent is ascorbate.
In some aspects, the (3+2) cycloaddition of azides and alkynes is conducted in the presence of a ligand. While not being bound by theory, the ligand is believed to stabilize the Cu(I) ion, thereby preventing its oxidation to the Cu(II) ion. Examples of ligands include 3-[4-({bis[(1-tert-butyl-1H-1,2,3-triazol-4-yl)methyl]amino}methyl)-1H-1,2,3-triazol-1-yl]propanol (BTTP); 3-[4-({bis[(1-tert-butyl-1H-1,2,3-triazol-4-yl)methyl]amino}methyl)-1H-1,2,3-triazol-1-yl]propyl hydrogen sulfate (BTTPS); 2-[4-({bis[(1-tert-butyl-1H-1,2,3-triazol-4-yl)methyl]amino}methyl)-1H-1,2,3-triazol-1-yl]ethyl hydrogen sulfate (BTTES); bathophenanthroline disulphonate disodium salt (BTTAA); Nε-((1R,2R)-2-azidocyclopentyloxy)carbonyl)-L-lysine (BPS); pentamethyldiethylenetriamine (PMDETA); tris(2-benzimidazolylmethyl)amine ((BimH)3); tris-(benzyltriazolylmethyl)amine (TBTA); and tris(3-hydroxypropyltriazolylmethyl)amine (THPTA). In a particular aspect, the ligand is THPTA.
The copper(I)-catalyzed azide-alkyne cycloadditions for labeling nucleic acids can be performed in water and a variety of solvents, including mixtures of water and a variety of (partially) miscible organic solvents including alcohols, dimethyl sulfoxide (DMSO), dimethyl formamide (DMF), tert-butanol (tBuOH) and acetone.
The mutant polymerases described herein can be used in a variety of methodologies, including the exemplary methods listed below and those in the Examples.
In some aspects, a method comprises an amplification reaction, wherein one or more of the nucleic acids synthesized by the mutant polymerase serve as a primer. The appropriate length of such primer or primers depends on the intended use of the primer but typically ranges from about 8 to about 200 nucleotides, including intermediate ranges, such as from about 10 to about 50 nucleotides, from about 15 to about 35 nucleotides, from about 18 to about 75 nucleotides and from about 25 to about 150 nucleotides. In some aspects, the primer comprises up to 50 deoxyribonucleotides. In some aspects, the primer is comprised of deoxyribonucleotides and ribonucleotides. In some aspects, the amplification is template dependent. In some aspects, the amplification is template independent. In some aspects, the template independent amplification is terminal deoxynucleotidyl transferase (TdT) amplification.
In some aspects, a PCR amplification reaction is performed after production of one or more nucleic acids comprising deoxyribonucleotides and ribonucleotides. In some aspects, a PCR amplification reaction is performed after production of one or more nucleic acids comprising deoxyribonucleotides. In some aspects, the amplification reaction does not require addition of primers other than one or more nucleic acids synthesized by mutant polymerases according to the current disclosure.
In some aspects, the amplification reaction is performed by a reverse transcriptase.
In some aspects, the method is for production of barcoded nucleic acid oligonucleotides, enzymatic primer synthesis, unbiased amplification of specific targets, whole genome amplification, or tagging via in vitro transcription.
In some aspects, nucleic acids synthesized by a mutant polymerase are used for amplicon sequencing or preparation of a sequencing library.
Enzymatic synthesis of ssDNA would be useful for many applications. In some aspects, during an in vitro transcription reaction, a mutant polymerase according to the current disclosure produces long (>100 nt) DNA oligonucleotides with improved efficiency and fidelity compared to other polymerases. In some aspects, synthesis of ssDNA from an inserted promoter sequence allows isothermal amplification of DNA of individual targets or whole genomes. In some aspects, a pool of plasmids serving as in vitro transcription templates is used for highly multiplexed enzymatic primer synthesis. In some aspects, single-stranded DNA molecules longer than 1 kb are enzymatically synthesized by a mutant polymerase described herein. Such enzymatically synthesized ssDNA molecules may be used, for example, as donor DNA in genome editing applications, in transfection, or in the field of next generation sequencing and other applications. In some aspects, the enzymatically synthesized molecules may also comprise ribonucleotides.
In addition, as an example, T7 RNA polymerase and mutant variants as described herein accept base-modified nucleotides; this enzymatic approach can therefore be useful for the production of functionalized ssDNA.
Some applications may require generation of primers starting from a single or a few molecules, for example, if introduction of unique barcode sequences within the reaction vessel is desired. In some aspects, a single or a few in vitro transcription (IVT) templates are provided, including but not limited to plasmids, PCR products, and oligonucleotide duplexes bearing a T7 RNA polymerase promoter sequence, and these templates are subjected to IVT with polymerase mutants described herein with a pool of nucleoside triphosphates including dNTPs or other modified or non-canonical nucleotides. The resulting molecules can be readily used in PCR or any other priming-based reactions.
In some applications synthesis of oligonucleotide pools bearing randomized regions might be desirable. In such applications, a library of IVT templates can be used to simultaneously synthesize a library of corresponding oligonucleotides by mutant polymerases described herein within a single IVT reaction in the presence of dNTPs and modified or non-canonical nucleotides. The resulting molecules can be readily used in PCR or any other priming-based reaction.
mRNA can be amplified in an Eberwine-like fashion by polymerase mutants described herein in the presence of dNTPs or modified or non-canonical nucleotides.
The “Eberwine Method” has been extensively described (See, for example, Marko et al., BMC Genomics 6:27 (2005)). In the Eberwine method, RNA templates are primed with an oligo(dT) primer that has been 5′ modified to contain a promoter for the T7 RNA polymerase and are subsequently reverse transcribed into first-strand cDNA. The RNA-cDNA hybrid is then treated with E. coli RNase H, and priming for second-strand cDNA synthesis occurs either by RNA nicking and priming or by cDNA hairpinning. Then, second-strand cDNA synthesis is carried out with E. coli DNA polymerase and E. coli DNA ligase followed by blunt-ending with T4 DNA polymerase. Transcription and amplification are then accomplished using the T7 RNA polymerase, which binds to the T7 promoter introduced during first-strand cDNA synthesis, producing antisense RNA (aRNA). When polymerase mutants described herein are used in the presence of deoxy- or non-canonical nucleotides, the resulting aRNA may be produced in a chimeric, more stable form. Alternatively, when dNTPs are used as described herein, single stranded DNA may be produced during such amplification.
RNA-based vaccines are a promising therapeutic tool for gene delivery. Improving the stability of such therapeutic transcripts can improve the production process as well as the robustness of the immune response. More nuclease-resistant and chemically stable RNA consisting of certain non-canonical modified nucleotides might be synthesized by IVT using mutant polymerases described herein. RNA synthesized via in vitro transcription (IVT) can also be capped during the transcription reaction (in a process called “co-transcriptional capping”), when a cap analog (i.e., a synthetic analog of the N7-methylated guanosine triphosphate cap) is included in the IVT reaction along with other nucleotide triphosphates.
Aptamers are short single-stranded DNA or RNA molecules that can bind to certain target molecules with high affinity and specificity. Aptamers can be produced by IVT using mutant polymerases described herein. Enzymatic synthesis may enable the simultaneous production of certain pools of aptamers for high-throughput profiling studies.
Certain sequencing-ready library preparation methods, including but not limited to CEL-Seq (See Hashimshony T, et al. Cell Rep. 2(3):666-73 (2012)), CEL-Seq2 (See Hashimshony T, et al. Genome Biol. 17:77 (2016)), and LIANTI (See Chen C, et al. Science. 356(6334): 189-194 (2017)), contain a linear amplification step which is performed by IVT. Target DNA or cDNA is amplified in RNA form and then converted back to DNA to produce NGS libraries. Linear amplification directly into ssDNA, or into a chimeric nucleic acid form that is tolerated by DNA polymerases and can be directly amplified, might improve such workflows by omitting the RNA-to-DNA conversion step. Such improvement may be achieved by using mutant polymerases described herein.
This disclosure has been described broadly and generically herein. Each of the narrower species and subgeneric groupings falling within the generic disclosure also form part of the disclosure. This includes the generic description with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein. Other embodiments are within the following claims. In addition, where features are described in terms of Markush groups, those skilled in the art will recognize that the features are hereby described in terms of any individual member or subgroup of members of the Markush group.
One skilled in the art would readily appreciate that the present disclosure is well adapted to carry out the objects and obtain the ends and advantages mentioned, as well as those inherent therein. Further, it will be readily apparent to one skilled in the art that varying substitutions and modifications may be made without departing from the scope and spirit of the disclosure. The compositions, methods, procedures, treatments, molecules and specific compounds described herein are presently representative of preferred aspects, are exemplary, and are not intended as limitations on the scope. Changes therein and other uses that will occur to those skilled in the art are encompassed within the spirit of the disclosure and are defined by the scope of the claims. The listing or discussion of a previously published document in this specification should not necessarily be taken as an acknowledgement that the document is part of the state of the art or is common general knowledge.
The embodiments illustratively described herein may suitably be practiced in the absence of any element or elements, limitation or limitations, not specifically disclosed herein. Thus, for example, the terms “comprising”, “including”, “containing”, etc. shall be read expansively and without limitation. The word “comprise” or variations such as “comprises” or “comprising” will accordingly be understood to imply the inclusion of a stated integer or groups of integers but not the exclusion of any other integer or group of integers. Additionally, the terms and expressions employed herein have been used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope claimed. Thus, it should be understood that although the present disclosure has been specifically disclosed by exemplary embodiments and optional features, modification and variation of the embodiments herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this disclosure.
The content of all documents and patent documents cited herein is incorporated by reference in its entirety.
A T7 RNA polymerase (T7 RNAP) mutant library was constructed using the strategy of site-saturation mutagenesis (See Packer and Liu, Nature Rev Genetics 16(7): 379-394 (2015)). A total of 77 amino acid positions were chosen for mutagenesis, based on their proximity to the location of the substrate incorporation site in a structural model of the T7 RNAP (Cheetham et al., 1999). All selected amino acids are located within a 10 Å radius of the His784 and Gly542 residues, which are believed to be responsible for the enzyme and substrate interaction (See Cheetham and Steitz TA. Science 286:2305-9 (1999)). Of the selected amino acids, 33 belong to the fingers subdomain and the remaining 44 belong to the palm subdomain.
Selected amino acid positions of T7 RNAP for site-saturation mutagenesis were M420, D421, W422, R423, G424, R425, V426, Y427, G538, S539, C540, S541, G542, I543, Q544, H545, F546, S547, A548, M549, L550, D552, G555, G556, V559, N560, L561, I570, Y571, V634, M635, T636, L637, A638, Y639, G640, V689, A691, A692, V693, A695, M696, L699, W727, T729, P730, D731, F733, V735, Q737, Y739, S776, G777, I778, A779, P780, N781, F782, V783, H784, S785, Q786, D787, G788, S789, H790, L791, R792, H811, D812, F814, M832, Y836, V841, L842, F882, and A883.
Saturation mutagenesis of 77 amino acid positions results in 1463 unique single-mutation mutants of T7 RNAP (77 amino acid positions multiplied by 19 possible alternative amino acids per position gives 1463).
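As a consistency check on the library size stated above, the following minimal Python sketch parses the list of selected positions and recomputes the number of single-substitution variants. The names used (SELECTED_POSITIONS, count_single_mutants) are illustrative and are not part of the disclosure; the sketch assumes that only the 20 standard amino acids are considered, so that each position has 19 possible substitutions.

```python
import re

# Positions selected for site-saturation mutagenesis, copied from the text.
SELECTED_POSITIONS = """
M420, D421, W422, R423, G424, R425, V426, Y427, G538, S539, C540, S541,
G542, I543, Q544, H545, F546, S547, A548, M549, L550, D552, G555, G556,
V559, N560, L561, I570, Y571, V634, M635, T636, L637, A638, Y639, G640,
V689, A691, A692, V693, A695, M696, L699, W727, T729, P730, D731, F733,
V735, Q737, Y739, S776, G777, I778, A779, P780, N781, F782, V783, H784,
S785, Q786, D787, G788, S789, H790, L791, R792, H811, D812, F814, M832,
Y836, V841, L842, F882, and A883
"""

# Assumption: only the 20 standard proteinogenic amino acids are used.
STANDARD_AMINO_ACIDS = 20


def count_single_mutants(position_text):
    """Return (number of unique positions, number of single-substitution variants)."""
    # A label is one wild-type residue letter followed by its position number.
    labels = set(re.findall(r"[A-Z]\d+", position_text))
    substitutions_per_position = STANDARD_AMINO_ACIDS - 1
    return len(labels), len(labels) * substitutions_per_position


n_positions, n_variants = count_single_mutants(SELECTED_POSITIONS)
print(n_positions, n_variants)
```

Counting unique labels rather than raw matches guards against a position being listed twice, which would otherwise inflate the expected library size.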
Saturation mutagenesis was performed by Invitrogen™ GeneArt™ as a gene synthesis service (www.thermofisher.com/lt/en/home/life-science/cloning/gene-synthesis/geneart-gene-synthesis.html). 77 separate sub-libraries were assembled from synthetic oligonucleotides with the objective of introducing one substitution per molecule at defined positions. The pooled sub-libraries were digested with R.XhoI (Thermo Scientific™, FD0695) and R.HindIII (Thermo Scientific™, FD0504) and ligated into vector pBADHis_A_A236 (V43001, Thermo Scientific). Ligation reactions were transformed into E. coli strain DH10B-T1R (Invitrogen, C640003). Total cells from the transformation plates were harvested for glycerol stock preparation. Cells were resuspended in 50% glycerol (3 × 1 ml; 1.3 × 10^10 cells/ml). 48 colonies were picked from the transformation plates (LB agar medium: 1.2 % of agar, 1 % of tryptone, 0.5 % of yeast extract, 0.5 % of NaCl solution with 100 µg/mL of ampicillin). None of the 48 clones showed empty plasmids in the colony-PCR control, and 35 clones were sequence analysed (Sanger sequencing). 6 of the 35 clones contained erroneous sequences with a substitution (6x), deletion (0x) or insertion (0x) within the reading frame (data not shown). 29 of the 35 clones contained correct sequences (resulting library correctness is 83%).
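The library correctness figure follows directly from the clone counts reported above. A minimal Python sketch of the arithmetic is given below; the variable names are illustrative only, and the counts are those stated in the text.

```python
# Clone counts reported for the Sanger-sequenced sample of the library.
sequenced_clones = 35
erroneous_by_type = {"substitution": 6, "deletion": 0, "insertion": 0}

# Clones without any unintended change in the reading frame.
erroneous_clones = sum(erroneous_by_type.values())
correct_clones = sequenced_clones - erroneous_clones

# Library correctness as a percentage of the sequenced clones.
correctness_pct = 100.0 * correct_clones / sequenced_clones
print(correct_clones, round(correctness_pct))
```

Because only 35 clones were sequenced, this percentage is an estimate from a small sample rather than a measurement over the whole library.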
In order to evaluate the substitution rate and distribution in the final mutant library, the library was analysed by NGS on the Illumina™ MiSeq™ platform. The plasmid DNA pool was purified from the E. coli glycerol stock using the GeneJET™ Plasmid Miniprep Kit (Thermo Scientific, K0503). Purified plasmid DNA was sonicated and analysed on an Agilent™ 2100 Bioanalyzer using the High Sensitivity DNA Kit (Agilent Technologies, 5067-4626). The average size of fragmented DNA was 350 bp. The plasmid DNA library was prepared for NGS using the Collibri™ PCR-free PS DNA Library Prep Kit for Illumina Systems (Invitrogen™, A38609024). Size distribution and quality of the prepared DNA library were verified by capillary electrophoresis on the Agilent™ 2100 Bioanalyzer instrument using the Agilent High Sensitivity DNA Kit (Agilent Technologies, 5067-4626). The prepared library was quantified by qPCR using the Collibri™ Library Quantification Kit (Invitrogen, A38524100). The resulting library was sequenced on the Illumina MiSeq™ using the MiSeq Reagent Kit v2, 300 cycles (Illumina, CA, USA, MS-102-2002), with 2 × 151 bp paired-end reads. Data analysis revealed that all the target substitutions were present in the library. Frequency to coverage values varied from 0.001% to 0.6% in the final library.
Expression level and solubility of total proteins from the mutant library were subsequently evaluated as follows. E. coli DH10B-T1R cells bearing the plasmid DNA library (T7 RNAP mutant library) were cultivated in LB medium (1 % of tryptone, 0.5 % of yeast extract and 0.5 % of NaCl solution) containing 100 µg/mL of ampicillin for ~16 h at 37° C. and 220 rpm. A 1/100 volume of the overnight culture was transferred to fresh LB medium containing 100 µg/mL of ampicillin. Cells were cultivated at 37° C. and 220 rpm until an optical density (OD600) of 0.7-0.8 was reached; L-arabinose was then added to a final concentration of 10 mM, and mutant protein expression was performed for 4 h at 37° C. and 220 rpm. After protein expression, cells were collected by centrifugation and resuspended to a concentration of 1 optical unit (OD600) in 100 µl of 10 mM Tris-HCl, 1 mM EDTA buffer solution (pH 8). Cells were then sonicated. Soluble and insoluble protein fractions were separated by centrifugation: 35 µl of cell lysate (total proteins) was centrifuged for 15 min at 14 000 rpm and 4° C. Samples for SDS-PAGE were prepared by adding NuPAGE™ LDS Sample Buffer (Invitrogen, NP0007) to 1× final concentration and DTT to a final concentration of 100 mM. Samples were incubated at 95° C. for 5 min and then chilled on ice. One fifth of each sample was used for SDS-PAGE in a Novex™ 10% Tris-Glycine Mini Gel (Invitrogen, XP00100BOX) with 1× Novex™ Tris-Glycine SDS Running Buffer (Invitrogen, LC26754). The gel was washed and stained using PageBlue™ Protein Staining Solution (Thermo Scientific™, 24620).
Selection was performed to identify T7 polymerase mutants with expanded substrate range, such as being able to incorporate deoxyribonucleotides.
In vitro compartmentalization is an emulsion-based technology commonly used for protein in vitro evolution (Tawfik and Griffiths Nature Biotechnol 16:652-656 (1998)). Conventional experimental systems for enzyme in vitro evolution, such as the compartmentalized self-replication (CSR) approach, are limited in their ability to manage and quality control the selection process. Results obtained using such systems are often questionable and may lead to false conclusions. Precise and/or quantitative experiments using this approach are difficult because of the relatively high polydispersity of the droplets. Moreover, addition of new reagents to the droplets once they are formed, real time detection, and sorting of droplets are impossible.
Microfluidics-based in vitro evolution design in combination with fluorescence-activated droplet sorting techniques as used herein allowed for precision and control over nearly all experimental aspects, which in turn allows for fast and highly confident selection of enzyme mutants of interest (See Aharoni et al., PNAS 101:482-487 (2004); Agresti et al., PNAS 107(9):4004-4009 (2010); and Kintses et al., Chemistry & Biology 19(8):929-931 (2012)). Microfluidic technology allowed reproducible preparation of emulsions and introduced automation and real time monitoring of in vitro evolution experiments.
The main steps of T7 RNA polymerase mutant library screening for mutants that are able to use dNTPs as substrates include:
- 1. Generation of a library of mutant T7 RNAP genes and cloning it under the control of inducible promoter;
- 2. Design and construction of in vitro transcription (IVT) template which has T7 promoter sequence and a downstream sequence (~300 nt) unrelated to the sequence of T7 RNAP gene;
- 3. Expressing T7 RNAP mutants in Escherichia coli cells;
- 4. Co-encapsulation of individual E. coli cells with in vitro transcription buffer, cell lysis agent, dTTP, dCTP, ATP, GTP, RNase A, IVT template and molecular beacon probes into droplets; molecular beacon probes may hybridize to the sequence of IVT product;
- 5. Incubation of the resulting emulsion at 37° C. to allow for the transcription of the IVT template and degradation of any reaction products comprised only of ribonucleotides by RNase A;
- 6. Incubation of the emulsion at 60° C. to allow for probe hybridization to deoxypyrimidine-containing IVT products;
- 7. Injection of the emulsion into the microfluidics sorting chip, detection of fluorescence signals and droplet sorting (steps 4 to 7 are shown in FIG. 1);
- 8. Collecting sorted droplets that had fluorescence signals, breakage of the resulting emulsion, cleanup of nucleic acids and recovery of T7 RNAP mutant-encoding genes by PCR amplification; and
- 9. PCR product cloning and library preparation from plasmid DNA pool of selected T7 RNAP mutants.
After two rounds of selection under pressure to incorporate deoxypyrimidines, 2.7% of unique mutants had increased in frequency compared to their frequencies in the initial mutant library of Example 1 (See Table 1).
Table 1. Changes in initial mutant library composition after selection of T7 RNAP mutants with altered substrate specificity
Unique mutant category | Frequency, % |
Eliminated unique mutant | 20.5 |
Unique mutant where frequency decreased | 49.8 |
Unique mutant where frequency remained the same (a) | 27.1 |
Unique mutant where frequency increased (b) | 2.7 |
(a) Up to 2X enrichment.
(b) More than 2X enrichment.
In addition, three mutations reached a frequency above 1% in the final library: Y639F (~11%), V783M (~4%), and V783L (~1%). These mutations were enriched 72X, 54X and 9X, respectively, after selection (FIG. 2).
The present system proved to be precise and highly efficient for sorting out the Y639F mutant of T7 RNAP, which has been described in the art (See Sousa and Padilla, 1995). The frequency of the Y639F mutant in the mutant library was ~11% after two sorting cycles, a 72-fold enrichment compared to its frequency in the initial T7 RNAP mutant library.
Further, the mutations that were enriched at least ~7X as shown in FIG. 2 (that is, with an enrichment no more than 10X lower than that of the Y639F mutant) were selected for further evaluation.
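The ~7X cutoff corresponds to one tenth of the 72X enrichment observed for the Y639F reference mutant. The following minimal Python sketch illustrates that selection rule using only the three fold-enrichment values reported in the text; the function names and the dictionary are illustrative assumptions, and enrichment values for other mutants shown in FIG. 2 are not reproduced here.

```python
# Fold enrichments reported in the text for the three most frequent mutants.
reported_enrichment = {"Y639F": 72.0, "V783M": 54.0, "V783L": 9.0}


def enrichment_cutoff(reference_fold, max_fold_below=10.0):
    """Lowest enrichment accepted: at most `max_fold_below` times lower than the reference."""
    return reference_fold / max_fold_below


def select_candidates(enrichment, reference="Y639F", max_fold_below=10.0):
    """Return the mutants whose enrichment meets the cutoff, sorted by name."""
    cutoff = enrichment_cutoff(enrichment[reference], max_fold_below)
    return sorted(name for name, fold in enrichment.items() if fold >= cutoff)


cutoff = enrichment_cutoff(reported_enrichment["Y639F"])
candidates = select_candidates(reported_enrichment)
print(cutoff, candidates)
```

Expressing the cutoff relative to a known reference mutant, rather than as a fixed number, keeps the rule meaningful if a different selection round yields a different overall enrichment scale.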
To evaluate if individual T7 RNAP mutants exhibit expanded substrate range, expression in E. coli was performed.
Vector pBAD-T7RNAP-wt comprising the wild type T7 RNAP gene sequence was used for site-directed mutagenesis of the T7 RNAP gene. Site-directed mutagenesis was performed using the Phusion Site-Directed Mutagenesis Kit (Thermo Scientific, F541) following the standard workflow. 5′-phosphorylation of oligonucleotides prior to PCR was also performed as described in the Phusion™ Site-Directed Mutagenesis Kit manual. The mutagenic primers used for introduction of the chosen point mutations were:
- SEQ ID NO: 10 and SEQ ID NO: 11 primers for V783M mutation,
- SEQ ID NO: 12 and SEQ ID NO: 13 primers for V689Q mutation,
- SEQ ID NO: 14 and SEQ ID NO: 15 primers for V783L mutation, and
- SEQ ID NO: 16 and SEQ ID NO: 17 primers for G555L mutation.
After PCR amplification of the pBAD-T7RNAP-wt plasmid DNA to be mutated, parental methylated and hemimethylated DNA was digested with FastDigest DpnI (Thermo Scientific, FD1703). After template DNA digestion, PCR product of approximately 6.8 kbp was purified from 1% agarose gel using GeneJET Gel Extraction Kit (Thermo Scientific, K0691) according to the standard protocol. Circularization of mutated PCR products was performed by ligation with T4 DNA Ligase using Rapid DNA Ligation Kit (Thermo Scientific, K1423) according to the standard protocol (30 ng of mutated PCR product was used for each reaction). Competent E. coli DH10B cells were prepared and transformed with ligation mixture following standard calcium chloride heat-shock transformation techniques (See Hanahan J Mol Biol 166:557-580 (1983)). Half of the ligation mixture was used for transformation. After the cell recovery period at 37° C. for 30 min, half of the transformation mixture was plated on LB medium supplemented with 100 µg/ml of ampicillin. Plates were incubated overnight at 37° C. Plasmid DNA identity of eight colonies for each mutation was checked via colony PCR using DreamTaq Green PCR Master Mix (Thermo Scientific, K1081). Three positive clones for each mutation were picked and cultured overnight in LB medium supplemented with 100 µg/ml of ampicillin at 37° C. Plasmid DNA was purified using GeneJET Plasmid Miniprep Kit (Thermo Scientific, K0502). Sequences of T7 RNAP gene mutants were determined by Sanger sequencing using BigDye™ Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, 4337457) and BigDye Xterminator™ Purification Kit (Applied Biosystems, 4376484) following standard protocols. Sanger sequencing confirmed which clones carried the desired mutations. Hence, expression vectors containing the mutated T7 RNAP genes (V783M, V689Q, V783L and G555L) were constructed: pBAD-T7RNAP-V783M, pBAD-T7RNAP-V689Q, pBAD-T7RNAP-V783L, pBAD-T7RNAP-G555L.
Phusion Site-Directed Mutagenesis Kit (Thermo Scientific, F541) and primers that have repetitive histidine codons (CATCACCATCACCATCAC, SEQ ID NO: 18) directly upstream of the START codon of the T7 RNAP gene were used for construction of plasmid DNA vectors. Standard workflow was followed, and 30 PCR cycles were performed. 5′-phosphorylation of oligonucleotides prior to PCR was performed as described in Phusion™ Site-Directed Mutagenesis Kit manual. Mutagenic primers used for introduction of desired insertion were SEQ ID NO: 19 and SEQ ID NO: 20. After PCR, all the steps were the same as described in the Example 3A.
Sequences of T7 RNAP gene mutants that have six repetitive histidine codons right upstream of the START codon were determined by Sanger sequencing using BigDye™ Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, 4337457) and BigDye Xterminator™ Purification Kit (Applied Biosystems, 4376484) following standard protocols. Expression vectors containing T7 RNAP mutants (V783M, V689Q, V783L and G555L) which have six repetitive histidine codons right next to the START codon of T7 RNAP gene were constructed: pBAD-T7RNAP-V783Mhis, pBAD-T7RNAP-V689Qhis, pBAD-T7RNAP-V783LHis, and pBAD-T7RNAP-G555LHis.
E. coli ER2566 cells were transformed with pBAD-T7RNAP-V783Mhis, pBAD-T7RNAP-V689Qhis, pBAD-T7RNAP-V783LHis, pBAD-T7RNAP-G555LHis and pBAD-T7RNAP-wtHis plasmid vectors. Competent E. coli ER2566 cells were prepared and transformed with plasmid vectors following standard calcium chloride heat-shock transformation techniques (See Hanahan 1983). 10 ng of each plasmid DNA was used for transformation. After the cell recovery period at 37° C. for 30 min, ⅒ of the transformation mixture was plated on LB medium supplemented with 100 µg/ml of ampicillin. Plates were incubated overnight at 37° C.
A single colony from each transformation plate was picked and cultured overnight in LB medium supplemented with 100 µg/ml ampicillin at 37° C., 220 rpm. 1/100 of the overnight culture was transferred to fresh LB medium containing 100 µg/mL of ampicillin. Cells were cultivated in one liter of LB medium at 37° C., 220 rpm until an optical density (OD600) of 0.7-0.8 was reached, and L-arabinose was added to a final concentration of 10 mM. T7 RNAP expression was performed for 4 h at 37° C., 220 rpm. After protein expression, cultures were centrifuged at 6000 × g, 4° C. for 30 min.
Purification of T7 RNAP mutants was performed by first resuspending prepared E. coli biomass in lysis buffer (50 mM Tris-HCl, 300 mM NaCl, 10 mM imidazole, 10 mM β-mercaptoethanol, 1 mg/ml lysozyme and 0.5 mM phenylmethylsulfonyl fluoride solution in water with pH 8 at 4° C. temperature). Cells were then lysed by sonication. Lysate was centrifuged at 16000 × g 4° C. for 30 min, and supernatant (soluble proteins) was collected.
T7 RNAP mutants were purified using HisPur™ Ni-NTA Superflow Agarose (Thermo Scientific, 25215) following standard instructions in the manual. Pierce™ Disposable Columns (Thermo Scientific, 29924) were used. Ratio of biomass and Ni-NTA Superflow Resin amount was from 1:2 to 1:4 (e.g., for protein purification from 1 g of biomass, 2-4 ml of Ni-NTA Superflow Resin were used). Composition of wash buffer (pH 8 at 4° C. temperature) used was 50 mM Tris-HCl, 300 mM NaCl, 20 mM imidazole, 10 mM β-mercaptoethanol, 0.1% (v/v) Triton X-100. Composition of elution buffer (pH 8 at 4° C. temperature) used was 50 mM Tris-HCl, 300 mM NaCl, 250 mM imidazole, 10 mM β-mercaptoethanol, 0.1% (v/v) Triton X-100.
After elution all the fractions from protein purification process were analysed by SDS-PAGE, and elution fractions containing the proteins of interest were selected. The combined elution fractions of each of the proteins, respectively, were transferred to T7 RNAP storage buffer (50 mM Tris-HCl (pH 8.0), 150 mM NaCl, 5 mM DTT, 0.03% (v/v) ELUGENT Detergent and 50% (v/v) glycerol). Dialysis was performed overnight at 4° C. using SnakeSkin™ Dialysis Tubing (Thermo Scientific, 88244). 0.1 mg/ml BSA was added to protein samples after dialysis. Purity of the proteins was more than 80%.
Polymerization activity was measured by performing in vitro transcription (IVT) with [3H]-labelled ATP. One unit of activity was defined as the amount of enzyme that incorporates 1 nmol of AMP into a polynucleotide fraction in 60 minutes at 37° C.
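As a non-limiting illustration of the unit definition, the amount of AMP incorporated in an assay can be converted to activity units as sketched below. The function names and the example numbers are hypothetical, and the normalization to 60 minutes assumes that incorporation is linear over the assay time, which is not stated in this disclosure.

```python
def activity_units(nmol_amp_incorporated: float, minutes: float = 60.0) -> float:
    """Units of activity: nmol of AMP incorporated per 60 min at 37 °C.

    Assumes linear incorporation over the assay time (an assumption of
    this sketch), so shorter assays are scaled up to 60 minutes.
    """
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return nmol_amp_incorporated * 60.0 / minutes


def units_per_microliter(nmol_amp_incorporated: float,
                         enzyme_volume_ul: float,
                         minutes: float = 60.0) -> float:
    """Volumetric activity (U/µL) of the enzyme preparation that was assayed."""
    if enzyme_volume_ul <= 0:
        raise ValueError("enzyme volume must be positive")
    return activity_units(nmol_amp_incorporated, minutes) / enzyme_volume_ul
```

For instance, 25 nmol of AMP incorporated in 30 minutes by 1 µL of enzyme would correspond to 50 U/µL under these assumptions.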
To evaluate the effects of V783M, V783L, G555L or V689Q mutation on T7 RNAP substrate specificity, in vitro transcription was performed.
Composition of in vitro transcription reaction was:
- Water, nuclease-free (Thermo Scientific, R0581) to 20 µL;
- 1.25 µL of RiboLock RNase Inhibitor (40 U/µL) (Thermo Scientific, EO0381);
- 1 µL of Pyrophosphatase, inorganic (0.1 U/µL) (Thermo Scientific, EF0221);
- 1.6 µL of DMSO, Anhydrous (Invitrogen, D12345);
- 1 µL of dCTP Solution (100 mM) (Thermo Scientific, R0151) or 1 µL of CTP Solution, Tris-buffered (100 mM) (Thermo Scientific, R1451) - depending on the reaction;
- 1 µL of dTTP Solution (100 mM) (Thermo Scientific, R0171) or 1 µL of UTP solution, Tris buffered (100 mM) (Thermo Scientific, R1471) - depending on the reaction;
- 1 µL of ATP Solution, Tris buffered (100 mM) (Thermo Scientific, R1441) or 1 µL of dATP Solution (100 mM) (Thermo Scientific, R0141) - depending on the reaction;
- 1 µL of GTP Solution, Tris buffered (100 mM) (Thermo Scientific, R1461) or 1 µL dGTP Solution (100 mM) (Thermo Scientific, R0161) or 1 µL of 2′-F-dGTP (100 mM) (TriLink Biotechnologies, N-1009) - depending on the reaction;
- 4 µL of 5X TranscriptAid Reaction Buffer from TranscriptAid T7 High Yield Transcription Kit (Thermo Scientific, K0441);
- ~1 µg of linearized and purified pTZ19R DNA (Thermo Scientific, SD0141). Prior to in vitro transcription, pTZ19R DNA (Thermo Scientific, SD0141) was digested with FastDigest SmaI (Thermo Scientific, FD0663) according to the standard reaction conditions. Restriction reaction was purified using GeneJET Gel Extraction and DNA Cleanup Micro Kit (Thermo Scientific, K0831);
- 50 U of purified T7 RNA polymerases (wt, V783M, V783L, G555L or V689Q). Also, wt T7 RNA polymerase without His tag was used (Thermo Scientific, EP0113).
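For orientation, the final concentrations implied by the 20 µL reaction composition can be derived with the standard dilution formula. The sketch below is illustrative only; the function name is hypothetical, and neat DMSO is treated as a 100% (v/v) stock, which is an assumption of this example.

```python
def final_concentration(stock_conc: float, added_ul: float, total_ul: float) -> float:
    """Dilution formula C_final = C_stock * V_added / V_total.

    The result carries the units of the stock (mM, U/µL, X, % v/v).
    """
    if total_ul <= 0 or added_ul < 0 or added_ul > total_ul:
        raise ValueError("volumes must satisfy 0 <= added <= total and total > 0")
    return stock_conc * added_ul / total_ul


TOTAL_UL = 20.0  # reaction volume stated in the composition list

ntp_mm = final_concentration(100.0, 1.0, TOTAL_UL)             # 1 µL of a 100 mM nucleotide stock
dmso_percent = final_concentration(100.0, 1.6, TOTAL_UL)       # neat DMSO taken as 100% (v/v)
buffer_x = final_concentration(5.0, 4.0, TOTAL_UL)             # 5X reaction buffer
ribolock_u_per_ul = final_concentration(40.0, 1.25, TOTAL_UL)  # 40 U/µL inhibitor stock
```

Under these assumptions, each nucleotide added as 1 µL of a 100 mM stock is present at 5 mM, DMSO at 8% (v/v), the reaction buffer at 1X, and the RNase inhibitor at 2.5 U/µL.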
In vitro transcription reactions were incubated at 37° C. for 4.5 hours. Next, template DNA was digested with Lambda exonuclease (Thermo Scientific, EN0561) at 37° C. for 10 min, then the reaction was stopped by adding 4 µL of 0.5 M EDTA. In vitro transcription products were purified using AMPure XP magnetic beads (Beckman Coulter, A63882). Subsequently, 88 µL of AMPure XP magnetic beads was added, and samples were incubated for 15 minutes at room temperature. Elution was performed in 15 µL of nuclease-free water (Thermo Scientific, R0581) at 65° C. for 5 min. Purified products of single-stranded nucleic acids were incubated at 70° C. for 2 min to eliminate secondary structures.
Purified samples were diluted 5-200 times with water, and 1 µL of each diluted sample was analysed on Agilent™ 2100 Bioanalyzer using Agilent Small RNA Kit (Agilent Technologies, 5067-1548). Since template DNA was linearized with SmaI restriction endonuclease, the in vitro transcription product should be a 42-nucleotide sequence corresponding to SEQ ID NO: 21, comprised of ribo- and/or deoxyribonucleotides (depending on the mixture of nucleoside triphosphates used in the in vitro transcription reaction). Five different mixtures of nucleoside triphosphates were used for this experiment:
- dTTP, dCTP, ATP, and GTP;
- dTTP, CTP, ATP, and dGTP;
- dTTP, dCTP, dATP, and GTP;
- dTTP, dCTP, dATP, and 2′-F-dGTP; and
- UTP, CTP, ATP, and GTP.
The results of this experiment are provided in Agilent 2100 Bioanalyzer electropherograms (FIGS. 3A-6E) and summarized in Table 2, where activity of T7 RNAP mutants is provided as compared to wt T7 RNAP activity using different nucleotide substrates in an in vitro transcription reaction. Values indicate a yield ratio of the amount of full-length target transcript produced by a particular T7 mutant polymerase compared to the amount of the transcript produced by wt T7 RNAP under the same conditions. Full-length transcript yield is determined from Agilent Small RNA electropherogram data (target peak concentration). If no target product was detected, its concentration was taken to be the lowest value measurable with the Agilent Bioanalyzer Small RNA Kit. This value is 0.01 pg/µL because peak concentration is reported to two decimal places in the Small RNA Kit.
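The yield-ratio convention described above, including the substitution of 0.01 pg/µL for undetected products, can be sketched as follows. This is an illustrative example rather than the analysis code actually used; the names are hypothetical, and representing an undetected peak as `None` is an assumption of the sketch.

```python
from typing import Optional

DETECTION_FLOOR_PG_PER_UL = 0.01  # lowest reportable peak concentration


def effective_concentration(measured: Optional[float]) -> float:
    """Return the measured peak concentration, or the floor if no peak was found.

    None (no target peak) and values below the floor are both replaced by
    the floor so that the ratio below is always defined.
    """
    if measured is None or measured < DETECTION_FLOOR_PG_PER_UL:
        return DETECTION_FLOOR_PG_PER_UL
    return measured


def yield_ratio(mutant_pg_per_ul: Optional[float],
                wt_pg_per_ul: Optional[float]) -> float:
    """Full-length transcript yield of a mutant relative to wt T7 RNAP."""
    return effective_concentration(mutant_pg_per_ul) / effective_concentration(wt_pg_per_ul)
```

Under this convention a ratio of exactly 1.0 is returned when neither enzyme yields a detectable target peak, and a mutant producing a hypothetical 2.5 pg/µL against an undetected wt product gives a ratio of about 250.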
Table 2. Comparison of T7 mutant polymerases versus WT T7 RNAP
Substrate mix | V783M | V783L | V689Q | G555L |
a) dTTP, dCTP, ATP, GTP | 5.2 | 2.5 | 1.2 | 0.8 |
b) dTTP, CTP, ATP, dGTP | 289.4 | 22.7 | 91.8 | 253.8 |
c) dTTP, dCTP, dATP, GTP | 27.8 | 1.0 | 1.0 | 1.0 |
d) dTTP, dCTP, dATP, 2′-F-dGTP | 316.0 | 1.0 | 73.7 | 137.5 |
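For convenience, the yield ratios of the table above can be held in a small lookup structure and queried per mutant or per substrate mix. The sketch below only restates the tabulated values; the helper names are hypothetical.

```python
# Yield ratios (mutant vs. wt T7 RNAP) keyed by substrate mix, as tabulated above.
YIELD_RATIOS = {
    "a": {"V783M": 5.2, "V783L": 2.5, "V689Q": 1.2, "G555L": 0.8},
    "b": {"V783M": 289.4, "V783L": 22.7, "V689Q": 91.8, "G555L": 253.8},
    "c": {"V783M": 27.8, "V783L": 1.0, "V689Q": 1.0, "G555L": 1.0},
    "d": {"V783M": 316.0, "V783L": 1.0, "V689Q": 73.7, "G555L": 137.5},
}


def fold_range(mutant: str) -> tuple:
    """Smallest and largest yield ratio of one mutant across all substrate mixes."""
    values = [row[mutant] for row in YIELD_RATIOS.values()]
    return min(values), max(values)


def best_mutant(mix: str) -> str:
    """Mutant with the highest yield ratio for a given substrate mix."""
    row = YIELD_RATIOS[mix]
    return max(row, key=row.get)
```

Calling `fold_range("V783M")` returns the 5.2 to 316.0 span of the V783M column, and `best_mutant` returns V783M for each of the four substrate mixes.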
Results with positive control samples (RNA synthesis, where UTP, CTP, ATP, and GTP mixture was used) are shown in electropherograms in FIGS. 3E, 4E, 5E and 6E, respectively.
V783M T7 mutant polymerase synthesizes full-length 42 nt product using all tested substrate nucleotide combinations, and the efficiency of incorporation of sugar modified substrate is significantly higher compared to wt T7 RNA polymerase (see Table 2). V783M T7 mutant polymerase synthesizes 5.2-316.0 times more target transcript compared to WT T7 RNA polymerase.
V783L T7 mutant polymerase exhibits the ability to use deoxyribonucleoside triphosphates (e.g., dTTP, dGTP) instead of ribonucleoside triphosphates (e.g., UTP, GTP) for nucleic acid synthesis. Target transcript yield is 2.5 times higher (dTTP, dCTP, ATP, GTP) and 22.7 times higher (dTTP, CTP, ATP, dGTP) compared to WT T7 RNA polymerase.
V689Q and G555L T7 polymerase mutants synthesize full-length ssDNA using 2′-F-dGTP instead of rGTP 73.7-times and 137.5-times, respectively, more efficiently compared to WT T7 RNA polymerase. Also, the yield of the target transcript is 91.8-times higher with the V689Q mutant and 253.8-times higher with G555L mutant compared to WT T7 RNA polymerase when dTTP, CTP, ATP, dGTP are used as substrates.
Thus, all the analysed mutants of T7 RNA polymerase (including T7 mutant polymerases comprising V783M, V783L, V689Q or G555L mutations) have an expanded substrate range compared to WT T7 RNA polymerase and can incorporate deoxyribonucleotides more efficiently as compared to wt T7 RNA polymerase. Additionally tested T7 mutant polymerases comprising V783I or G555Y mutations, respectively, showed expanded substrate range compared to WT T7 RNA polymerase and could incorporate at least dCTP, dTTP, 2′-F-dCTP, 2′-F-dTTP and 2′-F-dGTP nucleotides.
T7 mutant polymerases that can incorporate dNTPs could be used as tools for de novo synthesis of oligonucleotides that may serve, for example, as primers for PCR.
In this experiment, six different mixtures of nucleoside triphosphates were used for in vitro transcription reaction:
- a) UTP, CTP, ATP, and GTP;
- b) dTTP, dCTP, ATP, and GTP;
- c) dTTP, dCTP, dATP, and GTP;
- d) dTTP, dCTP, dATP, and 2′-F-dGTP;
- e) dUTP, dCTP, ATP, and GTP; and
- f) dUTP, dCTP, dATP, and GTP.
Oligonucleotides with a 42 nucleotide sequence corresponding to SEQ ID NO: 21, comprised of ribo- and/or deoxyribonucleotides, respectively (depending on a mixture of nucleoside triphosphates used) were synthesized enzymatically via in vitro transcription. The in vitro transcription reactions comprised:
- Water, nuclease-free (Thermo Scientific, R0581) to 200 µL,
- 12.5 µL of RiboLock RNase Inhibitor (40 U/µL) (Thermo Scientific, EO0381),
- 10 µL of Pyrophosphatase, inorganic (0.1 U/µL) (Thermo Scientific, EF0221),
- 16 µL of DMSO, Anhydrous (Invitrogen, D12345),
- 20 µL of dCTP Solution (100 mM) (Thermo Scientific, R0151) or 20 µL of CTP Solution, Tris-buffered (100 mM) (Thermo Scientific, R1451),
- 20 µL of dTTP Solution (100 mM) (Thermo Scientific, R0171) or 20 µL of UTP solution, Tris buffered (100 mM) (Thermo Scientific, R1471) or dUTP Solution (100 mM) (Thermo Scientific, R0133),
- 20 µL of ATP Solution, Tris buffered (100 mM) (Thermo Scientific, R1441) or 20 µL of dATP Solution (100 mM) (Thermo Scientific, R0141),
- 20 µL of GTP Solution, Tris buffered (100 mM) (Thermo Scientific, R1461) or 20 µL dGTP Solution (100 mM) (Thermo Scientific, R0161) or 1 µL of 2′-F-dGTP (100 mM) (TriLink Biotechnologies, N-1009),
- 40 µL of 5X TranscriptAid Reaction Buffer from TranscriptAid T7 High Yield Transcription Kit (Thermo Scientific, K0441) - reaction buffer was kept at room temperature,
- ~30 µg of linearized and purified pTZ19R DNA (Thermo Scientific, SD0141). Prior to in vitro transcription, pTZ19R DNA (Thermo Scientific, SD0141) was digested with FastDigest SmaI (Thermo Scientific, FD0663) and purified as described in Example 5, and
- 2000 U of purified T7 RNA polymerase mutant V783M.
All frozen reaction components were thawed, mixed, and centrifuged briefly to collect all drops. RiboLock RNase Inhibitor, Pyrophosphatase, dCTP, dTTP, dATP, ATP, GTP, dGTP, 2′-F-dGTP and T7 polymerase mutants were kept on ice. Water, DMSO, 5X TranscriptAid Reaction Buffer, and pTZ19R/SmaI DNA were kept at room temperature. Reaction components were combined at room temperature in the order listed above (i.e., water was added first, and T7 RNA polymerase was added last). In vitro transcription reactions were incubated at 37° C. for 5 hours.
Next, template DNA was digested with Lambda exonuclease. 160 µL of water, nuclease-free (Thermo Scientific, R0581), 20 µL of 10X Reaction Buffer and 20 µL of Lambda Exonuclease (10 U/µL) (Thermo Scientific, EN0561) were directly added to the in vitro transcription reaction mixture. Template digestion was performed at 37° C. for 15 min. The reaction was stopped by the addition of 40 µL of 0.5 M EDTA. In vitro transcription products were purified using AMPure XP magnetic beads (Beckman Coulter, A63882) following the standard PCR purification protocol, except for the AMPure XP volume used, incubation time, and elution. After IVT and template DNA digestion, 880 µL of AMPure XP magnetic beads and 880 µL of 96% ethanol were added, and mixed samples were incubated for 15 minutes at room temperature. Elution was performed in 100-150 µL of nuclease-free water (Thermo Scientific, R0581) at 65° C. for 5 min. Purified synthesized oligonucleotides were incubated at 70° C. for 2 min to eliminate secondary structures. After heat denaturation, samples were kept on ice or stored at -70° C.
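The volumes given above imply a particular bead-to-sample ratio, which can be checked with simple arithmetic. The sketch below is illustrative; it assumes that the in vitro transcription reaction is exactly 200 µL and that no volume is lost during incubation, and the variable names are hypothetical.

```python
IVT_REACTION_UL = 200.0  # in vitro transcription reaction volume

# Additions for template digestion: water, 10X reaction buffer, Lambda exonuclease.
DIGESTION_ADDITIONS_UL = [160.0, 20.0, 20.0]
EDTA_UL = 40.0           # 0.5 M EDTA used to stop the digestion
BEADS_UL = 880.0         # AMPure XP magnetic beads
ETHANOL_UL = 880.0       # 96% ethanol

# Sample volume at the point where beads and ethanol are added.
sample_ul = IVT_REACTION_UL + sum(DIGESTION_ADDITIONS_UL) + EDTA_UL

bead_to_sample_ratio = BEADS_UL / sample_ul
ethanol_to_sample_ratio = ETHANOL_UL / sample_ul

# Exonuclease input during digestion: 20 µL at 10 U/µL.
exonuclease_units = 20.0 * 10.0
```

Under these assumptions the stopped reaction has a volume of 440 µL, so beads and ethanol are each added at twice the sample volume, and 200 U of Lambda exonuclease are used for template digestion.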
Purified samples were diluted 500-1000 times in water, and 1 µL of each diluted sample was analyzed on Agilent™ 2100 Bioanalyzer using Agilent Small RNA Kit (Agilent Technologies, 5067-1548), where distinct peaks of the reaction products were observed. Because template DNA was linearized with SmaI restriction endonuclease, the in vitro transcription products (oligonucleotides) should be 42 bases long.
The concentration of oligonucleotides was measured using NanoDrop™ spectrophotometer (Thermo Scientific). Next, the oligonucleotides were used as primers for PCR. A chemically synthesized forward primer (synthesized by Metabion International AG) having SEQ ID NO: 21 and comprised of deoxyribonucleotides was used in a positive control reaction.
The composition of the PCR reaction mixtures was as follows:
- 25 µL of a selected PCR Master Mix;
- 19 µL of Water, nuclease-free (Thermo Scientific, R0581);
- 5 ng of pTZ19R (Thermo Scientific, SD0141);
- 2.5 µL of chemically synthesized reverse primer of SEQ ID NO: 22 (10 µM); and
- 2.5 µL of 10 µM enzymatically synthesized primer comprised of nucleotides as listed in a, b, c, d, e or f options; or a chemically synthesized forward primer of SEQ ID NO: 21, comprised of deoxyribonucleotides.
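The final primer concentration in these PCR mixtures follows from the listed volumes. The sketch below is illustrative only: the volume in which the 5 ng of pTZ19R template is added is not stated, so a template volume of 1 µL (giving a 50 µL reaction) is assumed here, and the master mix is assumed to be a 2X formulation.

```python
# Volumes stated in the composition list (µL).
components_ul = {
    "master_mix": 25.0,
    "water": 19.0,
    "reverse_primer_10uM": 2.5,
    "forward_primer_10uM": 2.5,
}

TEMPLATE_UL = 1.0       # assumption: volume carrying the 5 ng of pTZ19R
MASTER_MIX_STOCK_X = 2.0  # assumption: 2X master mix
PRIMER_STOCK_UM = 10.0

total_ul = sum(components_ul.values()) + TEMPLATE_UL

# Dilution formula C_final = C_stock * V_added / V_total.
primer_final_um = PRIMER_STOCK_UM * components_ul["forward_primer_10uM"] / total_ul
master_mix_final_x = MASTER_MIX_STOCK_X * components_ul["master_mix"] / total_ul
```

With these assumptions each primer is present at 0.5 µM and the master mix at 1X in a 50 µL reaction.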
Selected PCR Master Mixes and respective cycling conditions were as shown in the data in FIGS. 7A-7E:
- FIG. 7A: DreamTaq PCR Master Mix (2X) (Thermo Scientific, K1071). Cycling conditions: 1 cycle of initial denaturation at 95° C. for 2 min; 30 cycles of denaturation at 95° C. for 30 s, annealing at 53° C. for 30 s and extension at 72° C. for 1 min; 1 cycle of final extension at 72° C. for 5 min.
- FIG. 7B: Platinum II Hot-Start PCR Master Mix (2X) (Invitrogen, 14000014). Cycling conditions: 1 cycle of initial denaturation at 94° C. for 2 min; 30 cycles of denaturation at 94° C. for 30 s, annealing at 60° C. for 30 s and extension at 68° C. for 30 s.
- FIG. 7C: Platinum SuperFi PCR Master Mix (Invitrogen, 12358010). Cycling conditions: 1 cycle of initial denaturation at 98° C. for 30 s; 30 cycles of denaturation at 98° C. for 30 s, annealing at 62° C. for 30 s and extension at 72° C. for 30 s; 1 cycle of final extension at 72° C. for 5 min.
- FIG. 7D: Phusion Hot Start II High-Fidelity PCR Master Mix (Thermo Scientific, F565L). Cycling conditions: 1 cycle of initial denaturation at 98° C. for 30 s; 30 cycles of denaturation at 98° C. for 30 s, annealing at 62° C. for 30 s and extension at 72° C. for 30 s; 1 cycle of final extension at 72° C. for 5 min.
- FIG. 7E: Phusion U Multiplex PCR Master Mix (Thermo Scientific, F562L). Cycling conditions: 1 cycle of initial denaturation at 98° C. for 30 s; 30 cycles of denaturation at 98° C. for 30 s, annealing at 62° C. for 30 s and extension at 72° C. for 30 s; 1 cycle of final extension at 72° C. for 5 min.
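The cycling programs listed above can be captured in a small data structure, for example to compare the programmed hold times of the five protocols. The sketch below restates the listed conditions; the class and field names are hypothetical, and ramping time between temperatures is deliberately ignored.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CyclingProgram:
    initial_denaturation_s: int
    cycles: int
    steps_s: Tuple[int, int, int]  # denaturation, annealing, extension
    final_extension_s: int = 0     # 0 when no final extension is listed

    def hold_time_s(self) -> int:
        """Sum of programmed hold times, excluding temperature ramping."""
        return (self.initial_denaturation_s
                + self.cycles * sum(self.steps_s)
                + self.final_extension_s)


PROGRAMS = {
    "7A DreamTaq": CyclingProgram(120, 30, (30, 30, 60), 300),
    "7B Platinum II Hot-Start": CyclingProgram(120, 30, (30, 30, 30)),
    "7C Platinum SuperFi": CyclingProgram(30, 30, (30, 30, 30), 300),
    "7D Phusion Hot Start II": CyclingProgram(30, 30, (30, 30, 30), 300),
    "7E Phusion U Multiplex": CyclingProgram(30, 30, (30, 30, 30), 300),
}

hold_times_min = {name: p.hold_time_s() / 60 for name, p in PROGRAMS.items()}
```

On this basis the DreamTaq program has the longest programmed hold time (67 minutes), owing to its 1 min extension step.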
After PCR, 5 µL of each sample was analysed by standard agarose gel electrophoresis (1% agarose, 1X Tris-Acetate-EDTA, 0.25 µg/mL of ethidium bromide). Prior to electrophoresis, 5 µL of PCR products were mixed with DNA Gel Loading Dye (6X) (Thermo Scientific, R0611) following general recommendations for DNA electrophoresis. The results of this experiment are shown in agarose gel electrophoresis images. As can be seen from FIGS. 7A-7E, target PCR products of 225 bp were synthesized when various chimeric oligonucleotides were used. Thus, chimeric oligonucleotides (i.e., comprising deoxyribonucleotides and ribonucleotides, as DNA/RNA) synthesized by a T7 mutant polymerase can serve as primers in PCR. There are some yield differences depending on the type of enzymatically synthesized primer that was used for PCR. The more deoxyribonucleotides a primer comprised, the higher the yield of target PCR product obtained (i.e., higher yields of target product were obtained when oligonucleotides comprising dTTP, dCTP, dATP, and GTP were used compared to oligonucleotides comprising dTTP, dCTP, ATP, and GTP). Also, chimeric oligonucleotides comprising dUTP instead of dTTP cannot serve as primers when proofreading polymerases are used for PCR (FIGS. 7C-7D). Such polymerases by nature are not able to incorporate dUTP and read through uracil present in the template. In contrast, Taq-based DNA polymerases (FIGS. 7A-7B) and an engineered proofreading DNA polymerase (FIG. 7E) could use oligonucleotides comprising dUTP as primers, and thus synthesized target PCR products of 225 bp (FIGS. 7A-7B and 7E).
PCR products were purified from 1% agarose gel using GeneJET Gel Extraction Kit (Thermo Scientific, K0691) according to the standard protocol. Amplicon libraries were prepared using Collibri™ PCR-free PS DNA Library Prep Kit for Illumina Systems, with UD indexes (Set B, 25-48) (Invitrogen, A43608024). Size distribution and quality of prepared amplicon libraries were verified by performing capillary electrophoresis analysis on Agilent™ 2100 Bioanalyzer instrument using the Agilent High Sensitivity DNA Kit (Agilent Technologies, 5067-4626). Before proceeding to sequencing, prepared libraries were quantified by qPCR using Collibri™ Library Quantification Kit (Invitrogen, A38524100). The resulting libraries were sequenced on the Illumina MiSeq™ using the MiSeq Reagent Kit v2, 300-cycles (Illumina, CA, USA, MS-102-2002); 2 × 151 bp paired-end reads were performed. Libraries were sequenced at an average depth of 0.1 M reads/sample. Data analysis revealed the target amplicon sequences in all cases, and the majority of generated reads (≥92%) aligned to a reference sequence. The ratio of mapped reads for each of the oligos was comparable to the ratio of mapped reads for PCR with chemically synthesized oligonucleotide. Thus, specific products are produced when chimeric oligonucleotides synthesized by a mutant T7 RNA polymerase are used in an amplification reaction. In addition, the amount of insertions/deletions was substantially lower in regions covered by enzymatically synthesized GTP- or 2′-F-dGTP-containing primers.
The enzymatic synthesis methods employing mutant polymerases as described herein and throughout the current disclosure advantageously provide for high-quality, simple and cheap oligonucleotide and polynucleotide synthesis.
T7 mutant polymerases capable of incorporating dNTPs can be used as tools for de novo synthesis of primers for reverse transcription. An experiment having a workflow corresponding to the scheme provided in FIG. 8 was performed.
A pTZ19R plasmid derivative with a sequence of PCR handle no. 2 (SEQ ID NO: 24) and oligo(dT)24VN (SEQ ID NO: 69) (where V is A, C or G; N is A, T, C or G) directly downstream of the T7 promoter sequence was constructed. A plasmid DNA library was generated by performing site-directed insertional mutagenesis. Further, the template for in vitro transcription was prepared as follows:
- Linearization and blunting of constructed plasmid DNA pool - Mva1269I restriction endonuclease and T4 DNA polymerase were used; and
- Linearized and end repaired plasmid DNA pool was purified as described in previous examples.
In vitro transcription was performed using T7 RNA polymerase mutant V783M. Provided nucleotide substrates were a mixture of dTTP, dCTP, dATP and GTP or a mixture of dTTP, dCTP, dATP and 2′-F-dGTP. After in vitro transcription, a chimeric anchored oligo(dT) (comprising sequence corresponding to SEQ ID NO: 45, and containing ribonucleotides or modified deoxyribonucleotides, depending on nucleotide triphosphates used during in vitro transcription) was generated. The synthesized reverse transcription primers at the 5′-end had a PCR handle No 2 sequence for introduction of full length P7 Illumina adapter. Next, template DNA was digested with λ exonuclease, and in vitro transcription products were purified using magnetic beads.
A 3′ mRNA-Seq library preparation was performed using either enzymatically or chemically synthesized anchored oligo(dT) primer (SEQ ID NO: 45) and ERCC Ex-Fold Mix2 (Invitrogen, #4456739) as an input mRNA. An oligonucleotide-tethered ddCTP (ddCTP-(Alxyl)-NNNNNNNNAGATCGGAAGAGCGTCGTGTA-3′-biotin, SEQ ID NO: 46) was used for random reverse transcription termination and tagging with PCR handle No 1 (SEQ ID NO: 23). SuperScript IV reverse transcriptase (Invitrogen) was used; the ratio of oligonucleotide-tethered ddCTP and dCTP was 1:20. Reverse transcription products were purified using streptavidin magnetic beads. Streptavidin magnetic beads ensured purification of cDNA tagged at 3′-end because the oligonucleotide-tethered ddNTP also comprised a biotin tag.
Introduction of full-length adapter sequences was performed via indexing PCR with Collibri Stranded RNA Library Prep Kit (Invitrogen, A38994024). During indexing PCR, products of reverse transcription were amplified and barcoded. Libraries were purified using magnetic beads; size distribution and quality of prepared libraries were verified by performing capillary electrophoresis analysis on Agilent™ 2100 Bioanalyzer instrument. Before proceeding to sequencing, the prepared libraries were quantified by qPCR with Collibri Library Quantification Kit (Invitrogen, A38524500).
The resulting libraries were sequenced on the Illumina MiSeq™ using the MiSeq Reagent Kit v2, 300-cycles; sequencing was performed at single-read mode, 109 bp R1. Samples were sequenced at 0.1 M reads/sample depth.
Data analysis revealed a ratio of mapped reads of 92-96%. The library that was prepared using the chemically synthesized anchored oligo(dT) primer resulted in a comparable alignment rate of 99%. All mapped reads in all samples had the expected ERCC-spike biotype. Obtained strand-specificity in all samples was 99.9-100%. Reverse transcription primed with enzymatically synthesized oligo(dT), as well as with chemically synthesized oligo(dT), ensured specific mRNA capture in all cases. Detected gene counts and ERCC linearity did not differ meaningfully whether an enzymatically or a chemically synthesized reverse transcription primer was used. These results confirm that a chimeric oligonucleotide synthesized by a T7 RNA mutant polymerase can be used in the priming-based reaction of reverse transcription.
DNA amplification and random termination/tagging via in vitro transcription using a T7 RNA mutant polymerase was tested, using the protocol shown in FIG. 9A.
In vitro transcription was performed using T7 RNA polymerase mutant V783M. Intact pTZ19R plasmid DNA bearing a PCR handle sequence no. 2 (SEQ ID NO: 24) directly downstream of a T7 promoter was used as a template DNA. Provided nucleotide substrates for in vitro transcription were either a mixture of dTTP, dCTP, dATP and GTP or a mixture of dTTP, dCTP, dATP and 2′-F-dGTP. Also, synthesis was randomly terminated and tagged using oligonucleotide-tethered ddUTP (ddUTP-(Alxyl)-NNNNNNNNAGATCGGAAGAGCGTCGTGTA-3′-biotin, SEQ ID NO: 46) or oligonucleotide-tethered ddCTP (ddCTP-(Alxyl)-NNNNNNNNAGATCGGAAGAGCGTCGTGTA-3′-biotin, SEQ ID NO: 46). The ratio of oligonucleotide-tethered ddNTP to the respective dNTP was either 1:500 or 1:5000. The oligonucleotide-tethered ddNTPs used have a PCR handle sequence no. 1 (SEQ ID NO: 23). After in vitro transcription, ssDNA fragments tagged at both ends were generated. Next, template DNA was digested with λ exonuclease and in vitro transcription products were purified using streptavidin magnetic beads. Streptavidin magnetic beads ensured purification of fragments that were tagged at the 3′ end because the oligonucleotide-tethered ddNTPs comprised a biotin tag.
Introduction of full-length adapter sequences was performed via indexing PCR as in the previous example. During indexing PCR, products of in vitro transcription were amplified and barcoded. Libraries were purified using magnetic beads; size distribution and quality of prepared libraries were verified by performing capillary electrophoresis analysis on Agilent™ 2100 Bioanalyzer instrument. Before proceeding to sequencing, the prepared library was quantified by qPCR as in the previous example.
The resulting libraries were sequenced on the Illumina MiSeq™ using the MiSeq Reagent Kit v2 Nano, 300-cycles; sequencing was performed at single-read mode, 109 bp R1. Samples were sequenced at 0.1 M reads/sample depth. Data analysis confirmed that a T7 mutant RNAP can use an oligonucleotide-tethered ddNTP as a nucleotide substrate and incorporate it. Generated reads had a correct structure as shown in FIG. 9B:
- first 8 nucleotides represent unique molecular identifier (UMI) region,
- 9th nucleotide represents ddNTP incorporation site (almost all reads at that position have the same nucleotide), and
- 10-109th positions indicate target plasmid DNA sequence.
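The read structure listed above lends itself to a simple positional split. The following sketch is a minimal illustration rather than the analysis pipeline actually used; the names are hypothetical, and the example read is synthetic.

```python
from typing import NamedTuple

UMI_LENGTH = 8  # positions 1-8: unique molecular identifier


class ParsedRead(NamedTuple):
    umi: str               # positions 1-8
    termination_base: str  # position 9: ddNTP incorporation site
    target: str            # positions 10 onward: plasmid-derived sequence


def parse_read(read: str) -> ParsedRead:
    """Split a read into UMI, ddNTP incorporation site, and target sequence."""
    if len(read) < UMI_LENGTH + 2:
        raise ValueError("read is too short to contain UMI, site, and target")
    return ParsedRead(
        umi=read[:UMI_LENGTH],
        termination_base=read[UMI_LENGTH],
        target=read[UMI_LENGTH + 1:],
    )


# Synthetic 109-nt read used only to demonstrate the split.
example = parse_read("ACGTACGT" + "C" + "G" * 100)
```

For a 109-nt read this yields an 8-nt UMI, a single base at the incorporation site, and a 100-nt target sequence corresponding to positions 10-109.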
The majority of generated reads (≥90%) aligned to a reference sequence. Coverage across the reference sequence indicated that all reads covered a template sequence under T7 promoter (FIG. 9C). As expected, the highest sequencing depth was observed near the transcription start. Also, as expected, termination by incorporation of oligonucleotide-tethered dideoxynucleotides was random and thus, transcription termination position varied.
These results demonstrate that a T7 mutant polymerase described herein can be used for nucleic acid amplification and random termination/tagging via in vitro transcription. Linear amplification directly into ssDNA or into chimeric RNA/DNA nucleic acid form that is tolerated by DNA polymerases and is directly amplified may improve various workflows by omitting the step of RNA to DNA conversion. Synthesis of ssDNA from the artificially added T7 promoter sequence enables isothermal amplification of DNA of individual targets or even whole genomes.
A multiplexed enzymatic oligonucleotide synthesis can be performed, for example, according to the principal workflow as shown in FIG. 10.
11 pairs of primers for amplification of specific fragments from bacteriophage λ genomic DNA (GenBank: J02459.1) were synthesized. Chosen targets were tail and tail tip genes of λ phage: Z, U, V, G, T, H, M, L, K, I and J (estimated amplicon length of 234-271 bp).
Plasmid DNA pools that would serve as IVT templates for multiplexed enzymatic primer synthesis were constructed. Pools having templates for forward and reverse primers were generated separately. In vitro transcription templates were constructed to have a specific sequence downstream of T7 promoter: a sequence of a PCR-handle (PCR handle 1 in forward primers, and PCR handle 2 in reverse primers) and a sequence complementary to one of the 22 target sequences. As a control, chemically synthesized primers having the same sequences as would be synthesized enzymatically were used (SEQ ID Nos: 47-68).
Next, transformation of E. coli with the plasmid pools and collection of E. coli colony pools were performed. After purification of plasmid DNA libraries from E. coli pools, templates for in vitro transcription were prepared as follows:
- Linearization and blunting of constructed plasmid DNA pools; restriction endonuclease and T4 DNA polymerase were used, and
- Linearized and end repaired plasmid DNA pools were purified.
In vitro transcription was performed using T7 mutant RNA polymerase V783M. The nucleotide substrates provided for in vitro transcription were a mixture of either 1) dTTP, dCTP, dATP and GTP or 2) dTTP, dCTP, dATP and 2′-F-dGTP. The synthesized primers have a PCR handle sequence at their 5′-end. Next, template DNA was digested with λ exonuclease and in vitro transcription products were purified using magnetic beads.
Synthesized primer pools (comprising sequences corresponding to SEQ ID Nos: 47-68; in the case of enzymatically synthesized primers, also containing ribonucleotides or modified deoxyribonucleotides, depending on the nucleotide triphosphates used during in vitro transcription) were used for amplification of λ genomic DNA target sequences. After multiplexed PCR and treatment with exonuclease I, reaction products were cleaned up using magnetic beads. Barcoding and introduction of full-length adapter sequences were performed via indexing PCR as in previous examples. Amplicon libraries were purified using magnetic beads. The size and quality of the prepared libraries were verified by capillary electrophoresis on an Agilent™ 2100 Bioanalyzer instrument. Before proceeding to sequencing, the prepared libraries were quantified by qPCR as in previous examples.
The resulting libraries were sequenced on the Illumina MiSeq™ using the MiSeq Reagent Kit v2, 300-cycles; 2 × 151 bp paired-end reads were generated. Libraries were sequenced at an average depth of 0.2 M reads/sample.
Data analysis revealed that all 11 targets were amplified. Thus, it is possible to produce multiple oligonucleotide sequences with a T7 mutant polymerase, and all of them can be used in a targeted (multiplex) amplification reaction. Multiplexed enzymatic primer synthesis using a T7 mutant RNA polymerase could be a unique offering for amplicon sequencing solutions.
Alignment was done to determine whether the amino acids that confer changes in activity of T7 RNAP mutants, such as G555, V689, and V783, are conserved in other RNAPs.
The program “MultAlin” was used to align the sequence of a given RNAP with the T7 RNAP. MultAlin is publicly available at www.multalin.toulouse.inra.fr/multalin/ and is described in Corpet, 1988. Representative parameters included: symbol comparison table: blosum62; gap weight: 12; and gap length weight: 2.
As shown in FIG. 11, amino acids related to the mutation positions found during the screening for T7 RNAP mutants were conserved across the aligned RNA polymerases. Thus, generating mutations at equivalent positions to those found for the T7 RNAP may provide other RNAPs with altered substrate specificity.
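Once two sequences are aligned, the residue equivalent to a given T7 RNAP position can be read from the alignment column containing that position. The following is a minimal sketch of that lookup, assuming the alignment is available as two gapped strings of equal length; the function name and the short toy sequences are hypothetical and are not taken from FIG. 11.

```python
def equivalent_position(ref_aligned: str, other_aligned: str, ref_pos: int):
    """Map a 1-based residue number in the reference to the aligned residue.

    Both inputs are rows of the same alignment, with '-' marking gaps.
    Returns (position_in_other, residue_in_other), or None when the
    reference residue is aligned to a gap in the other sequence.
    """
    if len(ref_aligned) != len(other_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0    # residues seen so far in the reference
    other_count = 0  # residues seen so far in the other sequence
    for ref_char, other_char in zip(ref_aligned, other_aligned):
        if ref_char != "-":
            ref_count += 1
        if other_char != "-":
            other_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            if other_char == "-":
                return None
            return other_count, other_char
    raise ValueError("ref_pos is beyond the end of the reference sequence")


if __name__ == "__main__":
    # Toy alignment rows (hypothetical, for illustration only).
    ref_row = "MKT-VLG"
    other_row = "M-TAVIG"
    print(equivalent_position(ref_row, other_row, 4))  # V in the reference
    print(equivalent_position(ref_row, other_row, 2))  # K faces a gap
```

With real alignment rows, the same lookup applied to positions such as 555, 689, or 783 of T7 RNAP would return the corresponding residue number in the other polymerase.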
To test the ability to incorporate anti-reverse cap analog (ARCA) during co-transcriptional capping, in vitro transcription (IVT) was performed in the following reaction mixture: 1 µg of template DNA (pTZ19R/AanI), 7.5 mM CTP, 7.5 mM ATP, 7.5 mM dTTP, 1.5 mM dGTP, 6 mM ARCA (Thermo Fisher Scientific), 50 U RiboLock RNase Inhibitor (Thermo Scientific), 0.1 U Inorganic Pyrophosphatase (Thermo Scientific), 8% DMSO, 50 U T7 RNA polymerase V783M in 1X TranscriptAid reaction buffer. The reaction mixture was incubated at 37° C. for 4 h, and then treated with lambda exonuclease to remove the template DNA. IVT products were purified using AMPure XP magnetic beads and analyzed on an Agilent 2100 Bioanalyzer using the Small RNA Kit. The efficiency of ARCA incorporation was comparable to that obtainable with wild type T7 RNAP and ribonucleotide triphosphates.
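For orientation, the molar relationship between ARCA and dGTP in the reaction mixture above can be computed directly from the stated concentrations. The following sketch is arithmetic only; the variable and function names are hypothetical, and the result is not a measured or predicted capping efficiency.

```python
# Concentrations (mM) as stated for the reaction mixture above.
REACTION_MM = {
    "CTP": 7.5,
    "ATP": 7.5,
    "dTTP": 7.5,
    "dGTP": 1.5,
    "ARCA": 6.0,
}


def molar_ratio(components: dict, numerator: str, denominator: str) -> float:
    """Molar ratio of one component to another."""
    return components[numerator] / components[denominator]


def molar_share(components: dict, name: str, pool: list) -> float:
    """Share of one component within the summed concentration of a pool."""
    return components[name] / sum(components[item] for item in pool)


if __name__ == "__main__":
    print(molar_ratio(REACTION_MM, "ARCA", "dGTP"))            # ARCA : dGTP
    print(molar_share(REACTION_MM, "ARCA", ["ARCA", "dGTP"]))  # ARCA share of the pool
```

With the stated values, ARCA is present at a 4:1 molar ratio over dGTP, that is, four fifths of the combined ARCA and dGTP pool.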
The foregoing written specification is considered to be sufficient to enable one skilled in the art to practice the embodiments. The foregoing description and Examples detail certain embodiments and describe the best mode contemplated by the inventors. It will be appreciated, however, that no matter how detailed the foregoing may appear in text, the embodiments may be practiced in many ways and should be construed in accordance with the appended claims and any equivalents thereof.
As used herein, the term “about” refers to a numeric value, including, for example, whole numbers, fractions, and percentages, whether or not explicitly indicated. The term “about” generally refers to a range of numerical values (e.g., +/-5-10% of the recited range) that one of ordinary skill in the art would consider equivalent to the recited value (e.g., having the same function or result). When terms such as “at least” and “about” precede a list of numerical values or ranges, the terms modify all of the values or ranges provided in the list. In some instances, the term “about” may include numerical values that are rounded to the nearest significant figure.