ESR5 Reactivity for retrosynthesis steps by combining quantum mechanics and machine learning

Centre / Institution:
Pompeu Fabra University
Bioinformatics expertise:
machine learning, artificial intelligence, computer science, cheminformatics, bioinformatics, software engineering, software development.

Job description

Machine learning is changing our society, as exemplified by speech and image recognition applications. The life sciences are also changing rapidly through the use of artificial intelligence, and fields like drug development are expected to benefit from machine learning. The main goal of the AIDD project is to train and prepare the next generation of scientists, who need skills in both machine learning and drug discovery and who will, after graduating, be able to help speed up the drug development process.

The European Marie Skłodowska-Curie Innovative Training Network funds the AIDD project, which brings together twelve academic partners (Helmholtz Zentrum München (coordinator), Germany; Aalto University, Finland; Freie Universität Berlin, Germany; Katholieke Universiteit Leuven, Belgium; Johannes Kepler Universität Linz, Austria; The Swiss AI Lab IDSIA, Switzerland; TU Dortmund, Germany; Universiteit Leiden, Netherlands; Université du Luxembourg, Luxembourg; University of Vienna, Austria; Universitat Pompeu Fabra, Spain; and Vancouver Prostate Center, University of British Columbia, Canada) as well as four industrial partners (AstraZeneca, Sweden; Bayer Aktiengesellschaft, Germany; Janssen Pharmaceutica NV, Belgium; and Enamine Limited Liability Company, Ukraine).

The AIDD network offers 15 PhD fellowships (referred to under the programme as ESR, Early Stage Researcher, positions). The employed fellows will be supervised by academics who have strong technical expertise and have contributed to some of the fundamental AI algorithms that are used billions of times each day around the world, and by machine learning scientists working at pharmaceutical companies. The methods developed by the fellows will contribute to an integrated "One Chemistry" model that can predict outcomes ranging from different properties to molecule generation and synthesis. The network will offer comprehensive, structured training through a well-elaborated curriculum, online courses, and six schools. Each fellow will perform research for 1.5 years at an academic partner and 1.5 years at an industrial partner.

Description of the ESR5 position:

Accurate prediction of the outcomes of an organic reaction is still an unsolved task. Only experienced chemists can make reliable predictions based on underlying mechanistic and quantum chemical intuition. In this research project, the PhD candidate will develop a new methodology for outcome prediction based on fast quantum computation. The major expected outcomes are:

1. Selection of a set of relevant and simple cases of chemical synthesis steps in order to produce a database of accurate quantum mechanical data. The data will be generated in collaboration with ESR13.
2. Training of models and neural network potentials on this dataset to predict the feasibility of chemical reactions. Validation of the models by comparing predicted and experimental yields, in collaboration with ESR4, ESR7, and ESR12.
3. Implementation of an interface to be used by other ESRs. Further validation on internal synthesis data at Bayer and expansion of the applicability domain to a larger set of synthesis routes.

For further information please contact

The successful candidate will perform research at the following two partners:

1. Universitat Pompeu Fabra (academic partner) is a public, international and research-intensive university that ranks 10th in the 2020 Times Higher Education Young University Ranking and 1st among Spanish universities in the QS World University ranking 50 under 50 (position 28 worldwide and 7th in Europe). It is the 1st Spanish university (IUNE Report 2019) in yearly scientific output per professor, citations per professor, projects within the Spanish National R&D Plan (per 100 lecturers), and projects within the EU Framework Programme (per 100 lecturers). It has about 12,000 enrolled students (nearly 1,300 of them in one of its 9 PhD programs), 1,500 teaching and research staff, and 700 administrative and service staff. The UPF is strategically located in the PRBB, one of the leading South European biomedical research hubs, which hosts 1,300 international staff and scientists, core scientific facilities and multiple research institutes.

The student participating in this project will be enrolled in the PhD studies in BioMedicine and will join the Computational Science Laboratory (led by Prof. Gianni De Fabritiis), whose interests are the application of computation to solve real-world problems, defining intelligence as a form of computation. The research group develops machine learning models with intelligent, useful behavior for specific environments, using reinforcement learning and deep learning. Biomedicine is one environment where physics-based simulations and machine learning provide novel, innovative approaches. The group leads , one of the top distributed computing projects worldwide for running molecular simulations on GPUs and the open platform that has around a thousand registered scientists. The group and its spin-off company Acellera have collaborated with major companies worldwide, such as Sony, Nvidia, HTC mobile, UCB, Pfizer, Biogen and Novartis.

2. Bayer (industrial partner) is a global enterprise with core competencies in the Life Science fields of healthcare and agriculture. Its products and services are designed to benefit people and improve their quality of life. At the same time, the Group aims to create value through innovation, growth and high earning power. Bayer is committed to the principles of sustainable development and to its social and ethical responsibilities as a corporate citizen. In fiscal 2018, the Group employed around 117,000 people and had sales of EUR 39.59 billion. Capital expenditures amounted to EUR 1.5 billion, R&D expenses to EUR 5.2 billion.

Desired skills and expertise

Essential Skills and Experience

● Master's degree in computer science, cheminformatics, bioinformatics or equivalent subject

● Courses in machine learning

● Courses in programming

Desired skills:

● Experience in software engineering

● Proven experience of Python programming

● Experience with deep learning libraries (for instance TensorFlow and/or PyTorch)

● Experience with libraries such as RDKit or scikit-learn would be an advantage

● Good command of modern software development tools, such as git

● Courses in drug development

The successful candidate will also demonstrate a passion for pursuing scientific questions with a positive, problem-solving attitude and a willingness to undertake challenging analysis tasks in a timely fashion. Excellent English, both spoken and written, is required, as is the ability to work effectively both independently and in cross-functional teams. We also expect that you enjoy teamwork, have a collaborative nature and will be an encouraging colleague to all.

Female researchers and candidates are particularly encouraged to apply.

Contract duration and other benefits

Project and Institution that finance the contract: This project is funded by the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 956832.

Official number reference: PREUR01121 - H2020-MSCA-ITN-2020-AIDD-956832-G.DeFabritiis.

Benefits of the opening: Marie Skłodowska-Curie funding offers competitive salaries.

Net salary is subject to country-specific deductions and depends on individual factors such as family allowance.

Required information and contact

Information on the application process:

1. prepare your profile and provide sufficient details about your educational background and work experience, proof of your degree (or the expected time until you obtain it), your CV, and a cover letter outlining your motivation for applying for the position;

2. submit your application to recruit(at) before the deadline of April 18th, 2021 (the screening will start immediately; do not wait until the deadline to submit your application). Indicate the ESR number in the title of the letter.

Further information is available on