The development of medications capable of revolutionizing disease treatment is one of the great challenges of this century. The Galician group BioFarma has a powerful database of chemical compounds containing vital information for the pharmaceutical industry. A new project launched in 2012 expanded its chemical library to 60,000 molecules. Because it offers vast storage capacity along with security and protection for very sensitive data, CESGA manages the servers that store the data on these molecules.
The earliest evidence of human fingers leaving marks on clay tablets comes from ancient Babylon and Persia. It was not until 1891 that it was really understood that fingerprints were unique to each individual and could be used as a key for identification. This finding is attributed to the Argentinian police officer Juan Vucetich. A year later, his team was the first to identify a murderer by means of dactyloscopy.
The information revealed by the curves of the skin on the fingers is very sensitive and must therefore be protected. This need for confidentiality is not limited to personal identification. As a result of the scientific and technological revolution that began in the 20th century and continues today, many sectors generate and handle large volumes of information that require management and security, a phenomenon now known as Big Data. Such is the case for the Galician research group BioFarma, whose database is a treasure for the pharmaceutical industry.
The team led by Mabel Loza, Professor of Pharmacology at the University of Santiago (USC), has been providing information for the discovery of new medications since 1998. The aim of the more than thirty researchers at the Center for Research in Molecular Medicine and Chronic Diseases (CiMUS) is to identify the most promising chemical compounds for treating pathologies. “We create and store the fingerprint of each compound,” explains José Brea with a clarifying simile. Mr Brea is in charge of the group’s drug screening platform.
Detecting compounds that are active on cells is vital, for example, in the search for new chemotherapies to treat cancer. Information on how each compound reacts, together with its morphology, is stored under lock and key as a sort of history. “For each compound we store its chemical structure, origin, test plate location and other findings,” specifies José Manuel Santamaría, the group manager.
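As a rough illustration of the kind of record Santamaría describes, the sketch below models one library entry in Python. It is not BioFarma's actual schema: the class name `CompoundRecord`, the field names, the use of a SMILES string for the chemical structure, the plate location format and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class CompoundRecord:
    """Hypothetical library entry; every name here is illustrative."""

    compound_id: str      # assumed internal identifier
    structure: str        # chemical structure, here a SMILES string (assumption)
    origin: str           # where the compound came from
    plate_location: str   # test plate position, e.g. "P0001:A01" (assumed format)
    findings: list[str] = field(default_factory=list)  # "other findings"

    def add_finding(self, note: str) -> None:
        """Append a free-text observation to the compound's history."""
        self.findings.append(note)


# Made-up entry; aspirin's SMILES string serves purely as a placeholder.
record = CompoundRecord(
    compound_id="EX-000001",
    structure="CC(=O)OC1=CC=CC=C1C(=O)O",
    origin="example supplier",
    plate_location="P0001:A01",
)
record.add_finding("placeholder observation")
```

Keeping the findings as an open-ended list mirrors the idea of a history that grows with each test, whereas the structure, origin and plate location stay fixed for a given sample.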
Giant leap in capacity
With all this information, BioFarma has been putting together its own chemical library. At the beginning, when the group worked with samples of 15,000 molecules, managing the data did not pose a problem for the researchers’ daily routine. But 2012 brought a giant leap in the team’s activity with the birth of InnoPharma, a Galician initiative for the early discovery of drugs that also includes the team led by USC geneticist Ángel Carracedo.
What was the challenge? The search for active compounds to develop new therapeutic targets in neurology, psychiatry, metabolism, cancer, inflammation and rare diseases. The volume of the chemical library shot up, as the group was able to gather 60,000 molecules.
In this scenario, BioFarma needed more capacity to handle the volume of compound analyses and to store the vast amount of information safely. As Santamaría recalls, when the screening platform grew, the group found the solution for housing its huge volumes of data on the servers of the Supercomputing Center of Galicia (CESGA).
Heavy image files
BioFarma’s activity took on another dimension. CESGA contributed its experience in calculation, high performance computing and advanced services to the launch of InnoPharma. The increased management capacity represents a competitive advantage for the group, the Spanish leader among public institutions in the analysis of active compounds for new drugs.
But how did the group’s work adapt? Brea describes two types of data management, each using its own server at the Supercomputing Center. One server hosts the database where the group stores all the information on each compound. The other is used for the automated analysis and storage of microscopy images.
“Using a microscope we observe cells whose cellular structures have been marked with fluorescent wavelengths, with up to five signals per image. What is measured is the effect of chemical compounds on these cells,” says Santamaría. In each assay, the images of 384 experimental points are processed. Depending on its complexity, an assay can require up to ten gigabytes of memory. “The analysis software requires a lot of memory to quantify each of the signals and assign a numerical value to it,” he adds.
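The figures quoted above give a feel for the scale of a single assay. The following back-of-the-envelope sketch in Python is illustrative only. It assumes that every experimental point carries the maximum of five signals, that each signal yields one numerical value, and that the ten-gigabyte upper bound is spread evenly across the points, with 1 GB taken as 1,000 MB. None of these assumptions comes from the group, and the function names are made up for the example.

```python
POINTS_PER_ASSAY = 384       # experimental points per assay (from the article)
MAX_SIGNALS_PER_IMAGE = 5    # "up to five signals per image" (from the article)
MAX_MEMORY_GB = 10           # upper bound on memory per assay (from the article)


def max_values_per_assay(points: int = POINTS_PER_ASSAY,
                         signals: int = MAX_SIGNALS_PER_IMAGE) -> int:
    """Upper bound on quantified values, assuming one value per signal per point."""
    return points * signals


def memory_per_point_mb(memory_gb: float = MAX_MEMORY_GB,
                        points: int = POINTS_PER_ASSAY) -> float:
    """Average memory per point if the bound were spread evenly (1 GB = 1,000 MB)."""
    return memory_gb * 1000 / points


print(max_values_per_assay())           # 1920
print(round(memory_per_point_mb(), 1))  # 26.0
```

Under these assumptions, one assay yields at most 1,920 numerical values and averages about 26 MB of memory per experimental point in the most demanding case.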
Safe data storage
CESGA’s technological support translates not only into capacity, but also into security and protection. “This data is very sensitive and a secure environment without risk of intrusion is indispensable,” the group’s representatives explain. Putting the operation of their servers in the hands of the supercomputing center means that they will not be affected by power fluctuations or other types of failure that could paralyze their work and cause damage.
“For companies that require them, our services are like a quality certificate. Everything we handle is shielded from third-party activities,” said Santamaría.
With this initiative, BioFarma is able to reduce the gap between basic research into new therapeutic mechanisms and their industrial application. It is a question of pharmaceutical companies taking advantage of scientists’ know-how to put their efforts into developing new drugs. The chemical library offers them sophisticated basic information to outline the power and potential success of a new medication.
It is very likely that the images analyzed over the last five years at CiMUS will become the foundation for drugs to treat stroke, lung cancer or asthma in the future. The research and technology that will make it possible are 100% Galician.