Chemoinformatics

Career in Chemoinformatics*

 M Karthikeyan*, National Chemical Laboratory, Pune, INDIA

email: m.karthikeyan [at] ncl.res.in  ; karthincl [at] gmail.com


RECENT Publications (click for details) 

 

 
* Available for Chemoinformatics Training, Seminars, and Hands-on Workshops (Tool development)
 
 
 
Are you still synthesizing a new compound?
Are you sure it is novel and not yet reported in 20+ million publications and 5+ million chemical patents?
Are you interested in publishing / patenting the molecule?
Are you aware of its physico-chemical and biological properties?
Can you search chemical structures with the aid of robots or search engines?
What is your source of chemical information for searching (SciFinder or Google)?
Can you design environmentally safe molecules using a green reaction route?
What is the next 'billion dollar' molecule waiting in the laboratory?
 

Recent Book : PRACTICAL CHEMOINFORMATICS [FOREWORD and FRONT MATTER]

 
 

Text Book : Practical Chemoinformatics , Springer 2014 (ISBN: 978-81-322-1779-4) http://www.springer.com/chemistry/book/978-81-322-1779-4
Ch 1.    Open-Source Tools, Techniques, and Data in Chemoinformatics. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 1-92. DOI 10.1007/978-81-322-1780-0_1
Ch 2.    Chemoinformatics Approach for the Design and Screening of focused virtual libraries. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 93-131. DOI 10.1007/978-81-322-1780-0_2
Ch 3.    Machine Learning Methods in Chemoinformatics for Drug Discovery. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 133-194. DOI 10.1007/978-81-322-1780-0_3
Ch 4.    Docking and pharmacophore modeling for virtual screening. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 195-269. DOI 10.1007/978-81-322-1780-0_4
Ch 5.    Active site directed pose prediction programs for efficient filtering of molecules. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 271-316. DOI 10.1007/978-81-322-1780-0_5
Ch 6.    Representation, fingerprinting and modeling of chemical reactions. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 317-374. DOI 10.1007/978-81-322-1780-0_6
Ch 7.    Predictive methods for Organic Spectral data Simulation. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 375-414. DOI 10.1007/978-81-322-1780-0_7
Ch 8.    Chemical Text mining for Lead Discovery. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 415-449. DOI 10.1007/978-81-322-1780-0_8
Ch 9.    Integration of Automated Work flow in Chemoinformatics for drug discovery. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 451-499. DOI 10.1007/978-81-322-1780-0_9
Ch 10.    Cloud computing Infrastructure development for Chemoinformatics. M Karthikeyan and Renu Vyas. Practical Chemoinformatics © Springer India 2014. Pages 501-528. DOI 10.1007/978-81-322-1780-0_10

 
 
 
To address these important questions in research, education and industry, please read on!
 

When two scientific disciplines meet, they can fill each other’s gaps and complement each other, giving rise to unprecedented scientific opportunities. One such field of recent interest is chemoinformatics. Chemoinformatics plays a key role in areas as diverse as chemical genomics and drug discovery, the storage of chemical information in databases, and the prediction of toxic substances. Today, these techniques are mostly used by pharmaceutical companies in the process of drug discovery, but they are also applied, for example, to “functional foods”, which nutritional companies design to improve body functions such as digestion or brain function.

 

While the term bioinformatics, defined as “the study of informatics process in biotic systems”, has been known since 1976, the corresponding emerging term in the pharmaceutical sector is chemoinformatics, which is defined as the “mixing of information resources to transform data into information and information into knowledge, intending for better rapid decisions in the arena of drug lead identification and optimization”. Chemoinformatics is a generic term that encompasses the design, creation, organization, storage, management, retrieval, analysis, dissemination, visualization and use of chemical information; it covers virtually every area where “chemical data” is accessed or changed by means of computers.

 

Chemoinformatics represents a vital link between experiment and theory in the area of drug design, through the extraction of information from data and its conversion into knowledge. With the explosion of publicly available genomic information in the middle of the 1990s, such as that resulting from the Human Genome Project, bioinformatics became very popular not only in the scientific community but also among the general audience. About two decades after bioinformatics was named, this led to the coining of its counterpart in chemistry, chemoinformatics. However, the field can actually be seen as about two hundred years old, dating back to the first accounts of chemical data published in the literature. Today’s chemoinformatics technology facilitates better organization, storage, retrieval and analysis of these data for further advanced predictive studies, thus saving time and money, possibly reducing animal experiments, and advancing humankind through the development of novel and safer drugs.

 

The last three decades have seen tremendous growth in this field with the advancement of computer technologies. Today many volumes have been written on this subject, and a few textbooks are even available for teaching in universities at the BSc and MSc level. Though full-time Master's degree programs are available in universities abroad, in India this field has yet to get full recognition. Currently chemoinformatics is being introduced as part of ongoing diploma or master's programs in bioinformatics, despite its maturity as a new discipline. Besides the traditional mainstream areas of chemoinformatics, such as:

  • database systems,
  • computer-assisted structure elucidation systems,
  • computer-assisted synthesis design systems, and
  • quantitative structure-activity relationship (QSAR), etc.,

several new research areas of chemoinformatics have appeared recently, such as

  • in silico library design,
  • virtual screening,
  • docking studies (to mimic protein-small molecule interactions in biological systems),
  • prediction of ADME (absorption, distribution, metabolism and excretion) and toxicity.
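
As an illustration of virtual screening, one common approach ranks library molecules by the similarity of their structural fingerprints to a known active compound, using the Tanimoto coefficient. The sketch below is a minimal example under stated assumptions: the fingerprints are hypothetical toy sets of "on" bit positions, and the names tanimoto, screen, query_fp and library are invented for this illustration. A real study would generate fingerprints with a toolkit such as CDK.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def screen(query: Set[int], library: Dict[str, Set[int]],
           threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy fingerprints; these are not real molecules.
query_fp = {1, 4, 7, 9}
library = {
    "mol_A": {1, 4, 7, 9},    # identical to the query
    "mol_B": {1, 4, 7, 12},   # shares three of five distinct bits
    "mol_C": {2, 3, 5},       # shares no bits with the query
}

hits = screen(query_fp, library, threshold=0.5)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

With these toy inputs only mol_A and mol_B pass the 0.5 threshold. The threshold itself is an arbitrary choice for the example and would be tuned for a real screening campaign.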

It is interesting to note that by the end of the 20th century almost all the major foundations and theories of chemistry had been well understood and established.

 

Chemistry has already evolved from largely a study of the elements, to a study of molecules, to currently a study of molecular interactions, especially those involving biological macromolecules such as the proteins and sugars that we humans are made of. This offers an excellent opportunity for chemoinformatics to grow in this new direction. The main focus of “cyber enabled chemistry”, recently identified by the US National Science Foundation, is on the development of integrated databases, data mining tools, molecular visualization and computational capabilities, and the remote and networked use of instrumentation.

 

The scope of this rapidly developing field will certainly continue to expand. It is worth mentioning that there is a new trend toward integrating chemoinformatics with bioinformatics. This is because many sectors of the chemical and pharmaceutical industries are interdisciplinary by nature, and major progress in those industries is occurring in bioinformatics and chemoinformatics side by side. Chemists will become more and more computer dependent, Internet dependent and chemoinformatics dependent.

 

Through its development over the past half century, chemoinformatics has reached wide acceptance today and will have a bright future! The purpose of this article is to highlight the various research and job opportunities available to a new generation of students in chemistry, computer science and biology at various levels in both academic and pharmaceutical environments.

 

Software Tools: Skills Required

Commercial/Academic: ChemAxon Tools (MarvinSketch, MarvinView, JChem), MOE (Molecular Operating Environment), Schrödinger, Accelrys, CambridgeSoft (PerkinElmer), Spartan, Gaussian

Open Source: CDK, JOELib, Weka, R, RapidMiner, AutoDock, Vina, SVM

Databases: PubChem (Substances, Compounds, BioAssay), ChEMBL, KiDB, FDA-Drugs, DrugBank, PDB, PubMed, Sequence Data (Genome, Proteome)

IDE: NetBeans, Eclipse, JCreator

Others: MATLAB, Scilab, Mathematica,

Java, C++,

MySQL, Oracle, PostgreSQL,

SVN, Git,

Galaxy, Taverna, Tavaxy, KNIME

 

Job Title of Recent Graduates

 

Graduates from the MSc in Chemoinformatics have taken up a variety of different types of posts upon starting employment. Examples of the job titles of recent graduates are given below: Chemoinformatics Scientist, Computational Chemist, Chemical Data Scientist, Regulatory Affairs Officer, Senior Information Analyst, Information Officer, Data Officer, Graduate IT Trainee, Programmer, QSAR Software Tester, Support Analyst, Business Analyst, Technical Editor, Consultant, Research Assistant, etc.

Organizations/Companies of Recent Graduates

Graduates from the MSc in Chemoinformatics obtain posts with a wide range of organizations and companies.

 

 Chemoinformatics activities at NCL - Pune (1995 - till date)

 

International Conference on Chemoinformatics

  1.  International Conference on Chemoinformatics, 23-25 January 2007, National Chemical Laboratory, Pune

 

About Us

We have been working in the area of chemoinformatics for the past two decades, especially on developing tools for academic and industrial research. In this direction we have carried out several predictive studies related to drug discovery research (QSAR, QSPR and QSTR). We applied a QSPR strategy to predict the melting points of a diverse class of organic molecules. As part of innovative research we developed a methodology for encoding molecules as barcodes, with truly computable structures, for inventory management. To handle large volumes of data we developed a program, ChemXtreme, to harvest chemical information from the Internet using search engines such as Google; the extracted data, such as molecular properties, activities and toxicity, were converted into specialized databases. ChemStar is another program, developed to handle large amounts of molecular data in a distributed computing environment (Ref 4-5), and it was applied to calculate molecular properties for the entire PubChem collection. We also contributed to compiling MSDS datasheets for the Central Pollution Control Board, New Delhi. Chemical data mining of Indian medicinal plants and traditional Chinese medicine from the scientific literature of the past four decades is in progress to build DoMINE.

What is new ? 

 List of Updated Publications (click) MILESTONES

 

Issue-1: Chemoinformatics for virtual screening and drug discovery Comb Chem High Throughput Screen. 2015 Vol 18 (6)

Issue-2: Chemoinformatics for virtual screening and drug discovery Comb Chem High Throughput Screen. 2015 Vol 18 (7)


Chemoinformatics for virtual screening and drug discovery Comb Chem High Throughput Screen. 2015 Jul 3.

 

Editorial (Thematic Issue: Design and Development of New Chemoinformatics Tools for Virtual Screening) , 18(6): 526 – 527 (2015)  Muthukumarasamy Karthikeyan and Renu Vyas.
DOI: 10.2174/138620731806150902190608

 

1. Role of Open Source Tools and Resources in Virtual Screening for Drug Discovery, 18(6): 528 – 543 (2015) Muthukumarasamy Karthikeyan and Renu Vyas.
DOI: 10.2174/1386207318666150703111911


2. ChemScreener: A Distributed Computing Tool for Scaffold based Virtual Screening, 18(6): 544 – 561 (2015) Muthukumarasamy Karthikeyan, Deepak Pandit and Renu Vyas.
DOI: 10.2174/1386207318666150703112242

3. Prediction of Bioactive Compounds Using Computed NMR Chemical Shifts, 18(6): 562 – 576 (2015) Muthukumarasamy Karthikeyan, Pattuparambil Ramanpillai Rajamohanan and Renu Vyas.
DOI: 10.2174/1386207318666150703113312

4. Protein Ligand Complex Guided Approach for Virtual Screening, 18(6): 577 – 590  (2015) Muthukumarasamy Karthikeyan, Deepak Pandit and Renu Vyas.
DOI: 10.2174/1386207318666150703112620

5. MegaMiner: A Tool for Lead Identification Through Text Mining Using Chemoinformatics Tools and Cloud Computing Environment, 18(6): 591 – 603  (2015) Muthukumarasamy Karthikeyan, Yogesh Pandit, Deepak Pandit and Renu Vyas.

DOI: 10.2174/1386207318666150703113525

6. Design and Development of ChemInfoCloud: An Integrated Cloud Enabled Platform for Virtual Screening, 18(6): 604 – 619  (2015) Muthukumarasamy Karthikeyan, Deepak Pandit, Arvind Bhavasar and Renu Vyas.

DOI: 10.2174/1386207318666150703113656

Editorial (Thematic Issue: Role of Data and Methods in Chemoinformatics for Virtual Screening), 18(7): 622 - 623 (2015) Muthukumarasamy Karthikeyan and Renu Vyas.
DOI: 10.2174/138620731807150903101821

7. Pharmacophore and Docking Based Virtual Screening of Validated Mycobacterium tuberculosis Targets, 18(7): 624 – 637  (2015) Renu Vyas, Muthukumarasamy Karthikeyan, Ganesh Nainaru and Murugan Muthukrishnan.

DOI: 10.2174/1386207318666150703112759

8. Role of Chemical Reactivity and Transition State Modeling for Virtual Screening, 18(7): 638 – 657  (2015) Muthukumarasamy Karthikeyan, Renu Vyas, Sanjeev S. Tambe, Deepthi Radhamohan and Bhaskar D Kulkarni.
DOI: 10.2174/1386207318666150703113135

9. A Study of Applications of Machine Learning Based Classification Methods for Virtual Screening of Lead Molecules, 18(7): 658 – 672  (2015) Renu Vyas, Sanket Bapat, Esha Jain, Sanjeev S. Tambe, Muthukumarasamy Karthikeyan and Bhaskar D Kulkarni.

DOI: 10.2174/1386207318666150703112447

10. Chemoinformatics Approach for Building Molecular Networks from Marine Organisms, 18(7): 673 – 684  (2015) Muthukumarasamy Karthikeyan, Deepika Nimje, Rakhi Pahujani, Kushal Tyagi, Sanket Bapat, Renu Vyas and Krishna Pillai Padmakumar.
DOI: 10.2174/1386207318666150703112950

11. Muthukumarasamy Karthikeyan and Renu Vyas. Chemical Structure Representations and Applications in Computational Toxicity. Computational Toxicology: Volume I, Methods in Molecular Biology (2012), Volume 929, 167-192. DOI: 10.1007/978-1-62703-050-2_8

12. Distributed Chemical Computing Using ChemStar: Open Source Java RMI Architecture applied to Large Scale Molecular Data from PubChem. (2008) J. Chem. Inf. Model., 48 (4), 691-703.

13. Harvesting Chemical Information from the Internet Using a Distributed Approach: ChemXtreme (2006) J. Chem. Inf. Model., 46 (2), 452-461.

14. General Melting Point Prediction Based on a Diverse Compound Data Set and Artificial Neural Networks. (2005) J. Chem. Inf. Model.; 45(3) pp 581 - 590. (Update 22 may 2011: Dataset-2 12618 entries Range  Model (unpublished))

15. Encoding and Decoding Graphical Chemical Structures as Two-Dimensional (PDF417) Barcodes. (2005) J. Chem. Inf. Model.; 45(3) pp 572 - 580

16. Chemoinformatics A tool for modern drug discovery, (2002) Intl. J. Inf. Tech Mgmt. 1, (1), 69-82. [DOI: 10.1504/IJITM.2002.001188]

 

  1. ChemRobot: Harvest Chemical Data from Images (LINK)

  2. Text Book : Practical Chemoinformatics ,


Patents (India)

1 AUTOMATIC CLASSIFICATION OF BIOMOLECULAR INTERACTIONS BY ENERGY PROFILE FINGERPRINTING  IN  (Filing in progress)
2 A REMOTE COMPUTING ENVIRONMENT METHOD, SYSTEM AND APPARATUS FOR MOLECULAR INVESTIGATION BY EMAIL PLATFORM  IN  3527/DEL/2015
3 METHOD FOR ENCODING LARGE SCALE MOLECULAR LIBRARY IN BARCODING FORMAT  IN  1325/DEL/2015
4 NEW METHOD FOR THE SYNTHESIS OF (R)-PHENOXYBENZAMINE HYDROCHLORIDE EMPLOYING AZIRIDINE RING OPENING AS A KEY STEP   IN  1844/DEL/2014
5 DEVELOPMENT OF NMR CHEMICAL SHIFT FINGERPRINTS AND APPLICATIONS  IN  1874/DEL/2013
6 AUTOMATIC HARVESTING OF MOLECULAR INFORMATION RASTER GRAPHICS  IN  2420/DEL/2011

Patents (International)

7 AUTOMATIC HARVESTING OF MOLECULAR INFORMATION RASTER GRAPHICS  US  14/241285
8 DEVELOPMENT OF NMR CHEMICAL SHIFT FINGERPRINTS AND APPLICATIONS  WO  PCT/IB2014/062585  
9 AUTOMATIC HARVESTING OF MOLECULAR INFORMATION RASTER GRAPHICS  WO  PCT/IN2012/000567

Copyright (Software)

10 I C B C ( INTERNAL COMPATIBLE BAR CODE GENERATOR)  IN  L-19372/2001
11 ICIS (Interactive Chemical Information System) IN 1996

To create awareness of and promote chemoinformatics among the academic community and scientific funding agencies in India, we recently organized an International Conference on Chemoinformatics (Jan 22-24, 2007) at the National Chemical Laboratory, the first of its kind in the country. We also provide academic and industrial training in building chemical-biological databases related to chemoinformatics, and in the use of in-house databases, commercial databases and molecular informatics tools.

 

ChemInfoCloud (2012)  ChemRobot: Harvest Chemical Data from Images (LINK)

Government Sponsored Projects:

 Working in Molecular Informatics: application of HPC tools using distributed and cloud computing architecture to handle large-scale molecular data (~100 million+) defining the virtual chemical space of selected protein targets or therapeutic categories; organic reaction modeling using QM & QC and its extension to biological systems (INSPIRE Project, 12 FYP 2012-17); predictive QSAR (properties, toxicity, activity); artificial neural networks and other machine learning tools; text mining; and visual computing for molecular informatics, with applications in drug design, lead optimization, materials, education, research and management. Inventory and automation for sample tracking (NORMS, 12 FYP 2012-17), and chemical risk assessment and hazard analysis (industrial safety process modeling and simulations). Millions of docking runs are being performed in an HPC environment to understand protein-ligand interactions through in silico studies (Figure). Practical Chemoinformatics (from Springer) highlights the power of programming computers for chemoinformatics applications.

 Compute Molecular Descriptors using Moltable Portal! [Click]


Relevant Publications: (Big Data Challenges in Chemoinformatics: 10 billion web pages, 80 million molecules, 40 million Docking, Chemically intelligent Digital Eye)
 

Muthukumarasamy Karthikeyan Ph.D,

DST BOYSCAST FELLOW & DBT OVERSEAS ASSOCIATE (Univ. of North Carolina at Chapel Hill, USA)

For Complete Profile:[CV-PDF]

[LinkedIn]

[ResearchGate]

[Official website/Resume]

Address for Correspondence:

Muthukumarasamy Karthikeyan Ph.D, MBA

Principal Scientist, (CHEMOINFORMATICS)

Digital Information Resource Center (DIRC) & Centre of Excellence in Scientific Computing
CSIR-National Chemical Lab. Pune - 411 008, INDIA
Email: karthincl@gmail.com, m.karthikeyan@ncl.res.in
Ph: (O) +91-(0)-20 2590-2483 (M-F: 9.00AM-5.30PM IST) Mobile: +91-(0)-976-742-7981
URL:
http://moltable.ncl.res.in/

 [click to read PhD thesis (Organic Synthesis) Pune University]  [MBA: Marketing Generic Drugs in India (IGNOU)]  [Expertise: MSc Computer Science (FOSS) : ChemRobot: Visual Computing for Molecular Informatics (Anna University)]