Representation of Proteins with Posttranslational Modifications in the HL7 SPL Standard

  • Protocol
Methods in Pharmacology and Toxicology

Abstract

The Health Level Seven (HL7) Structured Product Labeling (SPL) is an ANSI-accredited data exchange standard, which was adopted by the US Food and Drug Administration (FDA) for the exchange of health and regulatory product and facility data. We describe an extension of this standard for exchanging structural characteristics of substances used as ingredients in medicinal products, particularly in biopharmaceuticals. The chapter covers basics of the abstract SPL data model, its specialization for substances, and its further specialization for proteins with posttranslational modifications. The standard utilizes the XML syntax framework, which allows combining specialized substance-related standards, such as the IUPAC International Chemical Identifier (InChI), with coded terminologies and quantitative parameters important for substance identification. The key elements of the data model for substances are structural units connected in a specified manner or related to each other as mixtures. Small molecules are represented by chemical structures and are uniquely defined using InChI. Macromolecules are represented in two different ways depending on whether they were synthesized in a template-driven chemical/biochemical process (e.g., proteins synthesized on ribosomes) or in a non-template-driven process (e.g., synthetic polymers). In the case of proteins, the arrangement of repeating units is described using the conventional amino acid letter notation. In the case of synthetic polymers, the explicit chemical structures of repeating units are provided. Finally, layers of modifications to the chains are described consistently by substituting the standard structural repeating units with special structural units whose structures are provided in the same XML document. The InChI canonicalization algorithm and the InChI atom numbering schema are used to ensure that the relationships between structural units are represented canonically. 
Bridging “bioinformatical” and “chemoinformatical” approaches in this way allows describing structures of very complex biochemical objects such as proteins with posttranslational modifications.
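The substitution idea described above can be illustrated with a minimal sketch. It is not the SPL XML schema: the class names, field names, the unit code "pSer", and the placeholder structure string are all hypothetical and chosen only for illustration. A chain is held in one-letter amino acid notation, and a modification is recorded by substituting the residue at a given position with a special structural unit whose explicit structure is defined alongside the chain.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class StructuralUnit:
    """A special structural unit (hypothetical model, not the SPL schema)."""
    code: str       # local identifier used when substituting a residue
    structure: str  # explicit chemical structure, e.g. an InChI string


@dataclass
class ProteinChain:
    """A chain in one-letter notation plus residue-level substitutions."""
    sequence: str
    substitutions: Dict[int, StructuralUnit] = field(default_factory=dict)

    def substitute(self, position: int, expected: str, unit: StructuralUnit) -> None:
        """Replace the residue at a 1-based position with a special unit."""
        if not 1 <= position <= len(self.sequence):
            raise IndexError(f"position {position} is outside the chain")
        if self.sequence[position - 1] != expected:
            raise ValueError(
                f"expected {expected!r} at position {position}, "
                f"found {self.sequence[position - 1]!r}"
            )
        self.substitutions[position] = unit

    def units(self) -> List[str]:
        """Return the chain as a list of unit codes after substitution."""
        return [
            self.substitutions[i].code if i in self.substitutions else residue
            for i, residue in enumerate(self.sequence, start=1)
        ]


# Assumed example: a four-residue chain with a modified serine at position 2.
# The structure string is a placeholder, not a real InChI.
modified_serine = StructuralUnit(code="pSer", structure="<InChI of the modified residue>")
chain = ProteinChain(sequence="GSAK")
chain.substitute(2, "S", modified_serine)
print(chain.units())
```

The original sequence string is left untouched, so the unmodified chain and the layer of modifications stay separable, in the spirit of the layered description in the abstract.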

Disclaimer

This publication targets the scientific chemoinformatics community only and should not be regarded as a guidance for industry.

Author information

Corresponding author

Correspondence to Yulia Borodina.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Borodina, Y., Schadow, G. (2018). Representation of Proteins with Posttranslational Modifications in the HL7 SPL Standard. In: Methods in Pharmacology and Toxicology. Humana Press. https://doi.org/10.1007/7653_2018_31

  • DOI: https://doi.org/10.1007/7653_2018_31

  • Publisher Name: Humana Press
