Defining Technology Buzzwords: What Machine Learning, Data Lakes and Artificial Intelligence Mean for You
Richard Lee, Senior Manager, Core Technology and Core Capability
Across all industries, technology buzzwords such as machine learning (ML), artificial intelligence (AI), big data, and data lakes appear in headlines daily. For scientists in the lab, it can be difficult to cut through the hype and truly understand which problems advanced technology is helping to solve, and how their workflows are affected.
Advanced technology has the potential to improve efficiency and experimental planning in the lab, once the data is in a usable form. While chemical R&D generates a deluge of data every day, that data is often siloed in disparate systems. The heterogeneity of analytical data formats, for example, makes it difficult not only to manage the entire ecosystem of data, but also to analyze the data that informs critical day-to-day decisions. Well-managed data not only helps to streamline daily workflows, but also leaves companies better prepared to use AI, machine learning, and big data systems. Furthermore, in our interactions with customers, we have found that the process of bringing data together and reviewing it en masse can flag issues and help improve processes.
In the lab of the future, we believe that scientists will still be responsible for the interpretation of their experimental data, but advanced tech will be used to guide experimental design to help scientists improve efficiency in the lab.
Here are a few buzzwords and the problems they can help solve in the lab of the future:
Machine learning is a focused application of AI that uses statistical techniques to give computer systems the ability to "learn", that is, to progressively improve performance on a specific task using existing data sets, without being explicitly programmed for that task. In the lab of the future, machine learning will pair human intelligence with intelligent technology to guide scientists through their work. While scientists will still be responsible for executing experiments, machine learning can assist with experimental design, drawing on historical data from both successful and unsuccessful reactions to point them in the right direction. This will not only help scientists avoid pitfalls, but also empower them to be more efficient in the lab.
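To make the idea of learning from past reactions a little more concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular product: the reaction records, the two features (temperature and catalyst loading), and the function names are all hypothetical, and a simple logistic regression stands in for whatever model a real system might use.

```python
import math

# Hypothetical historical records (assumed for illustration only):
# ((temperature in deg C, catalyst loading in mol%), outcome), 1 = success.
HISTORY = [
    ((40.0, 1.0), 0),
    ((50.0, 1.0), 0),
    ((60.0, 2.0), 0),
    ((45.0, 2.0), 0),
    ((70.0, 2.0), 1),
    ((80.0, 3.0), 1),
    ((90.0, 3.0), 1),
    ((60.0, 5.0), 1),
]


def scale(features):
    """Bring both features into a similar numeric range before training."""
    temperature, loading = features
    return (temperature / 100.0, loading / 10.0)


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(records, epochs=3000, learning_rate=0.5):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, outcome in records:
            x = scale(features)
            predicted = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = predicted - outcome
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict_success(model, features):
    """Estimate the chance that a proposed experiment succeeds."""
    weights, bias = model
    x = scale(features)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def rank_candidates(model, candidates):
    """Order proposed experiments from most to least promising."""
    return sorted(candidates, key=lambda c: predict_success(model, c), reverse=True)


if __name__ == "__main__":
    model = train(HISTORY)
    proposed = [(45.0, 1.0), (85.0, 3.0), (60.0, 2.0)]
    for conditions in rank_candidates(model, proposed):
        print(conditions, round(predict_success(model, conditions), 3))
```

The scientist still chooses and runs the experiment; the model only suggests which proposed conditions resemble past successes. With so few records the estimates are rough, which is one reason well-managed historical data matters.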
Given our commitment to equipping scientists with the tools they need to maximize productivity and accelerate decision-making, ACD/Labs has been using machine learning in many of our software offerings for decades, including our NMR predictors and physicochemical property calculators and predictors.
Data lakes provide a single repository for all enterprise-level data within an organization. Data can be stored in its natural, unstructured state, without any manual alteration or formatting. This enables enterprises to enter and analyze data efficiently without having to abide by rigid reporting structures, and makes it easier to search and share data across the organization.
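As a rough illustration of the principle, the sketch below stores files exactly as they arrive and keeps a lightweight catalog of descriptive tags alongside them so they can be found later. Everything here is hypothetical (the MiniDataLake class, the folder layout, the tag names); production data lakes are built on dedicated storage platforms rather than a local folder.

```python
import hashlib
import json
from pathlib import Path


class MiniDataLake:
    """Toy data lake: raw files are stored unchanged, tags live in a catalog."""

    def __init__(self, root):
        self.root = Path(root)
        self.raw_dir = self.root / "raw"
        self.catalog_path = self.root / "catalog.jsonl"
        self.raw_dir.mkdir(parents=True, exist_ok=True)

    def ingest(self, filename, payload, **tags):
        """Store the bytes as-is and record free-form tags in the catalog."""
        digest = hashlib.sha256(payload).hexdigest()
        stored = self.raw_dir / f"{digest[:12]}_{filename}"
        stored.write_bytes(payload)  # no conversion or reformatting
        entry = {
            "file": stored.name,
            "original_name": filename,
            "sha256": digest,
            "tags": tags,
        }
        with self.catalog_path.open("a", encoding="utf-8") as catalog:
            catalog.write(json.dumps(entry) + "\n")
        return entry

    def search(self, **tags):
        """Return catalog entries whose tags match every given key and value."""
        if not self.catalog_path.exists():
            return []
        matches = []
        with self.catalog_path.open(encoding="utf-8") as catalog:
            for line in catalog:
                entry = json.loads(line)
                if all(entry["tags"].get(k) == v for k, v in tags.items()):
                    matches.append(entry)
        return matches

    def read(self, entry):
        """Return the original bytes exactly as they were ingested."""
        return (self.raw_dir / entry["file"]).read_bytes()
```

Because nothing is converted on the way in, an instrument file, a spreadsheet, and a PDF report can sit side by side; structure is applied only when someone reads the data back out for analysis.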
While the term “data lakes” has been used in the industry for many years, the technology has developed over time as products and applications have been created to help companies take advantage of it. In addition to serving as a data source for machine learning and cloud computing, data lakes are seeing increased use within IT departments at biotech organizations because of their more flexible structure for storing unformatted data.
AI is a generic term used to describe human-programmed computer systems for generalized learning. It is a type of advanced technology that is built to mimic human behavioral skills, such as understanding, making decisions, and taking action. When applied in the context of R&D, AI can be used by scientists, for example, to assist in successful experimentation. By ‘learning’ from the failures and successes of previous experiments, AI technology can guide the scientist towards experiments that are more likely to succeed, thereby removing failed experiments from a project timeline and helping the project progress to the next stage with greater efficiency. AI may be used to predict the success of syntheses, separation methods, and many other experiments routinely used in R&D.
As many organizations suffer from siloed data formats and are unsure how to implement AI, establishing the foundational framework needed to support small-scale projects can be a helpful first step in understanding if and how AI can be a useful tool for scientists in drawing meaningful insights from large volumes of data.
While I believe it’s important that organizations be open to investigating and using new technologies as they emerge and use cases become more evident, it’s important to remember that human intuition will continue to play an important role. I don’t see technology taking over and replacing scientists. It may be a long time before technology can use human experiences to make decisions, if ever.