Career in Chemoinformatics*
M Karthikeyan*, National Chemical Laboratory, Pune, INDIA
email: m.karthikeyan [at] ncl.res.in ; karthincl [at] gmail.com
Textbook: Practical Chemoinformatics, Springer 2014 (ISBN: 978-81-322-1779-4) http://www.springer.com/chemistry/book/978-81-322-1779-4
Ch 1. Open-Source Tools, Techniques, and Data in Chemoinformatics. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 1-92. DOI 10.1007/978-81-322-1780-0_1
Ch 2. Chemoinformatics Approach for the Design and Screening of Focused Virtual Libraries. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 93-131. DOI 10.1007/978-81-322-1780-0_2
Ch 3. Machine Learning Methods in Chemoinformatics for Drug Discovery. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 133-194. DOI 10.1007/978-81-322-1780-0_3
Ch 4. Docking and Pharmacophore Modeling for Virtual Screening. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 195-269. DOI 10.1007/978-81-322-1780-0_4
Ch 5. Active Site Directed Pose Prediction Programs for Efficient Filtering of Molecules. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 271-316. DOI 10.1007/978-81-322-1780-0_5
Ch 6. Representation, Fingerprinting and Modeling of Chemical Reactions. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 317-374. DOI 10.1007/978-81-322-1780-0_6
Ch 7. Predictive Methods for Organic Spectral Data Simulation. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 375-414. DOI 10.1007/978-81-322-1780-0_7
Ch 8. Chemical Text Mining for Lead Discovery. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 415-449. DOI 10.1007/978-81-322-1780-0_8
Ch 9. Integration of Automated Workflow in Chemoinformatics for Drug Discovery. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 451-499. DOI 10.1007/978-81-322-1780-0_9
Ch 10. Cloud Computing Infrastructure Development for Chemoinformatics. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 501-528. DOI 10.1007/978-81-322-1780-0_10
When two scientific disciplines meet, they can benefit each other, fill each other’s gaps and complement each other, giving rise to unprecedented scientific opportunities. One such field of recent interest is chemoinformatics. Chemoinformatics plays a key role in areas as diverse as chemical genomics and drug discovery, the storage of chemical information in databases, and the prediction of toxic substances. Today these techniques are used mostly by pharmaceutical companies in the process of drug discovery, but also, for example, in “functional foods” designed by nutritional companies to improve body functions such as digestion or brain function.
Bioinformatics has been known since 1976 and is defined as “the study of informatics processes in biotic systems”. The corresponding term emerging in the pharmaceutical sector is chemoinformatics, defined as the “mixing of information resources to transform data into information and information into knowledge, for the intended purpose of making better decisions faster in the area of drug lead identification and optimization”. Chemoinformatics is a generic term that encompasses the design, creation, organization, storage, management, retrieval, analysis, dissemination, visualization and use of chemical information: virtually every area where “chemical data” is accessed or changed by means of computers.
Chemoinformatics represents a vital link between experiment and theory in drug design, through the extraction of information from data and its conversion into knowledge. With the explosion of publicly available genomic information in the mid-1990s, such as that resulting from the Human Genome Project, bioinformatics became very popular not only in the scientific community but also among the general public. This led, after about two decades, to the coining of its counterpart in chemistry: chemoinformatics. However, the field can be seen as about two hundred years old, dating from the first published accounts of chemical data in the literature. Today’s chemoinformatics technology facilitates better organization, storage, retrieval and analysis of these data for advanced predictive studies, saving time and money, possibly reducing animal experiments, and advancing humankind through the development of novel and safer drugs.
The last three decades have seen tremendous growth in this field, driven by advances in computer technology. Many books have been written on the subject, and a few textbooks are available for teaching at the BSc and MSc level. Although full-time Master's degree programs are available at universities abroad, in India the field has yet to gain full recognition. Currently, chemoinformatics is introduced as part of ongoing diploma or master's programs in bioinformatics, despite its maturity as a discipline in its own right. Besides the traditional mainstream areas of chemoinformatics, several new research areas have appeared recently.
It is interesting to note that by the end of the 20th century almost all the major foundations and theories of chemistry had been well understood and established.
Chemistry has evolved from largely a study of the elements, to a study of molecules, to what is currently a study of molecular interactions, especially those involving biological macromolecules such as the proteins and sugars that we humans are made of. This offers an excellent opportunity for chemoinformatics to grow in this new direction. The main focus of “cyber enabled chemistry”, recently identified by the US National Science Foundation, is on the development of integrated databases, data mining tools, molecular visualization and computational capabilities, and the remote and networked use of instrumentation.
The scope of this rapidly developing field will certainly continue to expand. It is worth mentioning that there is a new trend of integration of chemoinformatics with bioinformatics. This is because many sectors of the chemical and pharmaceutical industries are interdisciplinary by nature, and major progress and developments in those industries are occurring in both bioinformatics and chemoinformatics side by side. Chemists will become more and more computer dependent, Internet dependent and chemoinformatics dependent.
Through its development over the past half century, chemoinformatics has reached wide acceptance and has a bright future. The purpose of this article is to highlight the research and job opportunities available to a new generation of students in chemistry, computer science and biology, at various levels, in both academic and pharmaceutical environments.
Software Tools: Skills Required
Commercial/Academic: ChemAxon Tools (Marvin Sketch, Marvin View, JChem), MOE (Molecular Operating Environment), Schrödinger, Accelrys, CambridgeSoft (PerkinElmer), Spartan, Gaussian
Open Source: CDK, JOELib, Weka, R, RapidMiner, AutoDock, Vina, SVM
Databases: PubChem (Substances, Compounds, BioAssay), ChEMBL, KiDB, FDA-Drugs, DrugBank, PDB, PubMed, Sequence Data (Genome, Proteome)
IDE: NetBeans, Eclipse, JCreator
Others: MATLAB, Scilab, Mathematica; MySQL, Oracle, PostgreSQL; Galaxy, Taverna, Tavaxy, KNIME
Job Titles of Recent Graduates
Graduates from the MSc in Chemoinformatics have taken up a variety of posts upon starting employment. Examples of the job titles of recent graduates are: Chemoinformatics Scientist, Computational Chemist, Chemical Data Scientist, Regulatory Affairs Officer, Senior Information Analyst, Information Officer, Data Officer, Graduate IT Trainee, Programmer, QSAR Software Tester, Support Analyst, Business Analyst, Technical Editor, Consultant and Research Assistant.
Organizations/Companies of Recent Graduates
Graduates from the MSc in Chemoinformatics obtain posts with a wide range of organizations and companies.
Chemoinformatics activities at NCL, Pune (1995 to date)
International Conference on Chemoinformatics
We have worked in chemoinformatics for the past two decades, chiefly developing tools for academic and industrial research. In this direction we have carried out several predictive studies related to drug discovery research (QSAR, QSPR and QSTR). We applied a QSPR strategy to predict the melting points of a diverse class of organic molecules. As part of innovative research, we developed a methodology for encoding molecules as barcodes with truly computable structures for inventory management. To handle large volumes of data, we developed a program, ChemXtreme, to harvest chemical information from the Internet using search engines such as Google; the extracted data, such as molecular properties, activities and toxicity, were converted into specialized databases. ChemStar(5) is another program, developed to handle large amounts of molecular data in a distributed computing environment (Ref 4-5), and was applied to calculate molecular properties for the entire PubChem collection. We also contributed to compiling MSDS datasheets for the Central Pollution Control Board, New Delhi. Chemical data mining of Indian medicinal plants and traditional Chinese medicine from the scientific literature of the past four decades is under way to build DoMINE (in progress).
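As a rough illustration of the idea of splitting a large molecular property calculation across workers, the following Python sketch divides a short list of molecular formulas into chunks and computes molecular weights on a pool of worker threads. It is not ChemStar, which the publication list below describes as a Java RMI architecture; the function names, the small atomic-mass table and the example formulas are all assumptions chosen only for illustration.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Rounded standard atomic masses; only a few elements, for illustration.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def molecular_weight(formula):
    """Sum atomic masses for a simple formula such as 'C2H6O' (no brackets)."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(chunk):
    """Unit of work handed to a single worker (hypothetical helper)."""
    return {formula: molecular_weight(formula) for formula in chunk}


def compute_all(formulas, chunk_size=2, workers=4):
    """Distribute chunks over a thread pool and merge the partial results."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(process_chunk, chunked(formulas, chunk_size)):
            results.update(partial)
    return results


if __name__ == "__main__":
    for formula, mw in compute_all(["H2O", "CO2", "C2H6O", "C6H6", "NH3"]).items():
        print(f"{formula}: {mw:.3f}")
```

In a real large-scale setting the thread pool would presumably be replaced by separate processes or remote nodes, and the toy molecular-weight function by descriptor calculations from a chemoinformatics toolkit; the chunk-and-merge pattern is the only part this sketch is meant to show.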
What is new?
List of Updated Publications
MILESTONES
Chemoinformatics for virtual screening and drug discovery Comb Chem High Throughput Screen. 2015 Jul 3.
Editorial (Thematic Issue: Design and Development of New Chemoinformatics Tools for Virtual Screening), 18(6): 526 – 527 (2015) Muthukumarasamy Karthikeyan and Renu Vyas.
2. ChemScreener: A Distributed Computing Tool for Scaffold based Virtual Screening, 18(6): 544 – 561 (2015) Muthukumarasamy Karthikeyan, Deepak Pandit and Renu Vyas.
3. Prediction of Bioactive Compounds Using Computed NMR Chemical Shifts, 18(6): 562 – 576 (2015) Muthukumarasamy Karthikeyan, Pattuparambil Ramanpillai Rajamohanan and Renu Vyas. DOI: 10.2174/1386207318666150703113312
5. MegaMiner: A Tool for Lead Identification Through Text Mining Using Chemoinformatics Tools and Cloud Computing Environment, 18(6): 591 – 603 (2015) Muthukumarasamy Karthikeyan, Yogesh Pandit, Deepak Pandit and Renu Vyas.
6. Design and Development of ChemInfoCloud: An Integrated Cloud Enabled Platform for Virtual Screening, 18(6): 604 – 619 (2015) Muthukumarasamy Karthikeyan, Deepak Pandit, Arvind Bhavasar and Renu Vyas.
7. Pharmacophore and Docking Based Virtual Screening of Validated Mycobacterium tuberculosis Targets, 18(7): 624 – 637 (2015) Renu Vyas, Muthukumarasamy Karthikeyan, Ganesh Nainaru and Murugan Muthukrishnan.
8. Role of Chemical Reactivity and Transition State Modeling for Virtual Screening, 18(7): 638 – 657 (2015) Muthukumarasamy Karthikeyan, Renu Vyas, Sanjeev S. Tambe, Deepthi Radhamohan and Bhaskar D Kulkarni.
9. A Study of Applications of Machine Learning Based Classification Methods for Virtual Screening of Lead Molecules, 18(7): 658 – 672 (2015) Renu Vyas, Sanket Bapat, Esha Jain, Sanjeev S. Tambe, Muthukumarasamy Karthikeyan and Bhaskar D Kulkarni.
10. Chemoinformatics Approach for Building Molecular Networks from Marine Organisms, 18(7): 673 – 684 (2015) Muthukumarasamy Karthikeyan, Deepika Nimje, Rakhi Pahujani, Kushal Tyagi, Sanket Bapat, Renu Vyas and Krishna Pillai Padmakumar.
11. Muthukumarasamy Karthikeyan, Renu Vyas. Chemical Structure Representations and Applications in Computational Toxicity. Computational Toxicology: Volume I, Methods in Molecular Biology (2012) Volume 929, 167-192. DOI: 10.1007/978-1-62703-050-2_8
12. Distributed Chemical Computing Using ChemStar: Open Source Java RMI Architecture applied to Large Scale Molecular Data from PubChem. (2008) J. Chem. Inf. Model., 48 (4), 691-703.
13. Harvesting Chemical Information from the Internet Using a Distributed Approach: ChemXtreme (2006) J. Chem. Inf. Model., 46 (2), 452-461.
14. General Melting Point Prediction Based on a Diverse Compound Data Set and Artificial Neural Networks. (2005) J. Chem. Inf. Model.; 45(3) pp 581 - 590. (Update 22 may 2011: Dataset-2 12618 entries Range Model (unpublished))
15. Encoding and Decoding Graphical Chemical Structures as Two-Dimensional (PDF417) Barcodes. (2005) J. Chem. Inf. Model.; 45(3) pp 572 - 580
16. Chemoinformatics A tool for modern drug discovery, (2002) Intl. J. Inf. Tech Mgmt. 1, (1), 69-82. [DOI: 10.1504/IJITM.2002.001188]
1 AUTOMATIC CLASSIFICATION OF BIOMOLECULAR INTERACTIONS BY ENERGY PROFILE FINGERPRINTING IN (Filing in progress)
2 A REMOTE COMPUTING ENVIRONMENT METHOD, SYSTEM AND APPARATUS FOR MOLECULAR INVESTIGATION BY EMAIL PLATFORM IN 3527/DEL/2015
3 METHOD FOR ENCODING LARGE SCALE MOLECULAR LIBRARY IN BARCODING FORMAT IN 1325/DEL/2015
4 NEW METHOD FOR THE SYNTHESIS OF (R)-PHENOXYBENZAMINE HYDROCHLORIDE EMPLOYING AZIRIDINE RING OPENING AS A KEY STEP IN 1844/DEL/2014
5 DEVELOPMENT OF NMR CHEMICAL SHIFT FINGERPRINTS AND APPLICATIONS IN 1874/DEL/2013
6 AUTOMATIC HARVESTING OF MOLECULAR INFORMATION RASTER GRAPHICS IN 2420/DEL/2011
7 AUTOMATIC HARVESTING OF MOLECULAR INFORMATION RASTER GRAPHICS US 14/241285
8 DEVELOPMENT OF NMR CHEMICAL SHIFT FINGERPRINTS AND APPLICATIONS WO PCT/IB2014/062585
9 AUTOMATIC HARVESTING OF MOLECULAR INFORMATION RASTER GRAPHICS WO PCT/IN2012/000567
10 ICBC (INTERNAL COMPATIBLE BAR CODE GENERATOR) IN L-19372/2001
11 ICIS (Interactive Chemical Information System) IN 1996
To create awareness of and promote chemoinformatics among the academic community and scientific funding agencies in India, we recently organized an International Conference on Chemoinformatics (Jan 22-24, 2007) at the National Chemical Laboratory, the first of its kind in the country. We also provide academic and industrial training in building chemical-biological databases related to chemoinformatics, and in the use of in-house databases, commercial databases and molecular informatics tools.
Government Sponsored Projects:
Working in Molecular Informatics (application of HPC tools using distributed and cloud computing architectures to handle large-scale molecular data, ~100 million+, defining the virtual chemical space of selected protein targets or therapeutic categories; organic reaction modeling using QM & QC and its extension to biological systems (INSPIRE Project, 12 FYP 2012-17); predictive QSAR (properties, toxicity, activity); artificial neural networks and other machine learning tools; text mining; visual computing for molecular informatics and its application in drug design, lead optimization, materials, education, research and management). Inventory and automation for sample tracking (NORMS, 12 FYP 2012-17); chemical risk assessment and hazard analysis (industrial safety process modeling and simulations). Millions of docking runs are being performed in an HPC environment to understand protein-ligand interactions through in silico studies (Figure). Practical Chemoinformatics (from Springer) highlights the power of programming computers for chemoinformatics applications.
Compute Molecular Descriptors using Moltable Portal
International Conference on Chemoinformatics, 23-25 January 2007, National Chemical Laboratory, Pune
Relevant Publications: (Big Data Challenges in Chemoinformatics: 10 billion web pages, 80 million molecules, 40 million Docking, Chemically intelligent Digital Eye)
Muthukumarasamy Karthikeyan Ph.D,
Address for Correspondence:
Muthukumarasamy Karthikeyan Ph.D, MBA
Principal Scientist, (CHEMOINFORMATICS)
Digital Information Resource Center (DIRC) & Centre of Excellence in Scientific Computing
PhD thesis (Organic Synthesis), Pune University; MBA: Marketing Generic Drugs in India (IGNOU); Expertise: MSc Computer Science (FOSS): ChemRobot: Visual Computing for Molecular Informatics (Anna University)