Machine Learning for Health and Medicine

By Brian L.

The increasing power of computers has allowed certain industries to grow tremendously and new industries to blossom. This increase in computing power has given rise to a method of predictive data analysis and model building called machine learning. Through the use of complex algorithms, computers equipped with machine learning software are able to learn from past experience and data in order to produce reliable results and decisions. In many instances this saves precious time and resources, and it makes it possible to analyze data sets so large and complex that humans could not work through them on their own. These abilities have changed the way many industries and businesses look to the future. This post focuses on the effect machine learning has had within the health and medicine industry. I will examine its benefits, applications and potential shortcomings.

Machine learning at its core is closely associated with statistical analysis and data mining. I say this because, much like those fields, machine learning has to extract knowledge from data, often very large sets of data, before it can guess and predict values. There are two main ways information can be fed to and interpreted by a machine learning system. The first is called supervised learning, in which you train the machine using data that is well defined and labeled. This means each example is already tagged with the correct answer or outcome. The algorithm then uses what it has learned from those examples to predict the outcome for new data. An example would be supplying data about the sizes and sale prices of houses in order to predict what price another house may sell for. The other type is called unsupervised learning, in which the machine is trained using a data set that doesn’t have any labels. The learning algorithm is never told what the data represents. The aim of this type of learning is to supply the software with large sets of data so that the algorithm can form a model of the data and begin to recognize patterns in it. An example would be feeding thousands of essays to a machine and asking it to group them by similarity.
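To make the two types of learning a little more concrete, here is a minimal sketch in plain Python. It is only an illustration under simple assumptions: the house sizes, prices and essay word counts are made up, the function names are my own, and a real project would more likely use an established machine learning library. The supervised half fits a straight line to labeled house data. The unsupervised half groups essays using nothing but their word counts, which is a crude stand-in for real text similarity.

```python
# Toy illustration only: every number below is made up for the example.

def fit_line(xs, ys):
    """Supervised learning: ordinary least squares with one feature.

    xs are the inputs (house sizes) and ys are the known answers (prices).
    Returns (slope, intercept) of the best-fitting line.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def predict_price(model, size):
    """Use the fitted line to estimate the price of an unseen house."""
    slope, intercept = model
    return slope * size + intercept


def k_means_1d(values, k=2, iterations=10):
    """Unsupervised learning: a very small k-means for 1-D data (k >= 2).

    No labels are supplied. Returns a cluster index for each value.
    """
    ordered = sorted(values)
    # Start the centers spread evenly across the sorted values.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Move each center to the mean of the values assigned to it.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


# Supervised: sizes (square feet) paired with known sale prices.
sizes = [1000, 1500, 2000, 2500]
prices = [200000, 300000, 400000, 500000]
model = fit_line(sizes, prices)
print(predict_price(model, 1800))

# Unsupervised: essay word counts with no labels attached.
word_counts = [480, 510, 495, 1950, 2100, 2020]
print(k_means_1d(word_counts))
```

The key difference shows up in the inputs: fit_line needs the correct prices alongside the sizes, while k_means_1d only ever sees the raw word counts and has to discover the short and long groups on its own.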

With these types of learning in mind, we can now look at what potential benefits the field of health and medicine can reap from machine learning. When applied correctly, machine learning is able to predict patient outcomes more accurately. Again, when data is plentiful and rich in content, machine learning algorithms are able to find consistent patterns in it and draw correlations between them. Also, recognizing patterns in patient histories can lead to cost savings by skipping expensive and unnecessary tests. Many times when doctors are unsure of what is wrong with a patient, they will order tests that may be of no help to the patient. With a machine learning algorithm, much of the subjective nature of diagnosis and suggestion gives way to empirical evidence. Another possible benefit is improved accuracy and objectivity in early detection, again made possible by the predictive nature of machine learning algorithms. Lastly, doctors would be able to better prepare patients for end-of-life care if they could better predict how long a patient has to live, whether due to a degenerative disease or old age.

An example of a business applying machine learning in the healthcare field today is Health Catalyst, which developed machine learning code and is offering it to the public for free in an effort to expand collaboration and advance outcomes. This is a paradigm shift for the field of health and medicine, which has been relatively slow to adopt the new technology. The aim of Health Catalyst is to make its machine learning code relatively simple, so that those with only minimal programming experience can make meaningful contributions. The code is offered in two languages, R and Python. It allows one to pull data, possibly from a SQL database, and use it to train predictive models or convert it into meaningful analytics. With it, Health Catalyst is able to create predictive and pattern recognition models using a healthcare organization’s own data. Another potential use of this technology would be to allow smaller research firms or universities to create accurate models with healthcare data without having to hire large teams of scientists, saving costs while maximizing their information output.
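The general workflow of pulling data out of a SQL database and handing it to a model can be sketched in a few lines of Python. To be clear, this is not Health Catalyst's code or API. It is a rough illustration that assumes a hypothetical encounters table, made-up patient rows, and a very simple nearest-neighbor classifier standing in for a real predictive model.

```python
import sqlite3

# Hypothetical schema and made-up rows, used only to illustrate the workflow.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE encounters (age REAL, prior_admissions REAL, readmitted INTEGER)"
)
conn.executemany(
    "INSERT INTO encounters VALUES (?, ?, ?)",
    [(34, 0, 0), (41, 1, 0), (29, 0, 0), (72, 4, 1), (68, 3, 1), (80, 5, 1)],
)


def load_training_data(connection):
    """Pull labeled rows out of the database as (features, label) pairs."""
    rows = connection.execute(
        "SELECT age, prior_admissions, readmitted FROM encounters"
    ).fetchall()
    return [((age, prior), label) for age, prior, label in rows]


def predict_nearest(training, features):
    """1-nearest-neighbor: return the label of the closest known record."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(training, key=lambda pair: squared_distance(pair[0], features))
    return label


training = load_training_data(conn)
print(predict_nearest(training, (75, 4)))  # older patient, several prior admissions
print(predict_nearest(training, (30, 1)))  # younger patient, one prior admission
```

A real system would scale the features, hold out data to test accuracy, and use far more records than six, but the shape is the same: query the organization's own data, turn it into labeled examples, and let the model predict the outcome for a new patient.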

Another company at the forefront of machine learning in healthcare and medicine is IBM. With its highly popular supercomputer Watson, IBM hopes to enable researchers and doctors to analyze patients’ genomic data to highlight possible disease mechanisms. The analysis of human genes is a highly complex process that requires complex data models, which makes it well suited to a supercomputer that combines machine learning and analytical software, like Watson. What differentiates Watson from many other computers using machine learning software is that it has a natural language processor, which enables it to analyze publications from scientific journals in order to collect and process large amounts of structured and unstructured data. Advances in this area could change the way scientific papers are analyzed, but the technology is still in its infancy. What IBM hopes to accomplish is to give clinicians the ability to create personalized care for patients with advanced cancer. It does this by feeding Watson large sets of data from medical records and from patients with similar diseases, and looking for correlations and patterns in what has helped patients in the past. Watson is also able to provide genetic analysis of tumors, using its natural language processor to extract evidence from medical literature and generate reports that can help guide a patient’s treatment. Watson is primarily written in Java, with some C++, and it also uses the UIMA and Hadoop frameworks.

Though much has been said about the positive aspects of machine learning in health and medicine, that is not to say that no shortcomings exist. In fact, many challenges and limitations still face the industry. For instance, machine learning for health and medicine has yet to take off as it has in other industries like retail, because of the protections around healthcare data due to HIPAA. For those unfamiliar with HIPAA, it is an act passed by Congress in 1996 that sets an industry-wide standard for healthcare information in electronic billing and other processes. This affects machine learning because the large data sets it requires are hard to assemble through all the layers of protection and confidentiality surrounding individuals’ healthcare records. Also, machine learning algorithms are becoming great at predicting outcomes, but the aim of much of the research in health and medicine is finding root causes of illness and disease, which is difficult for machine learning to produce at the moment. Finally, due to the high cost of converting and updating technology, many hospitals have been slow to adopt new technology, and tend to do so only to comply with government-mandated regulations instead of trying to stay ahead of the curve by embracing innovation.

Advancements in computing have given way to the proliferation of machine learning, giving humans a level of prediction, data modeling and analysis never before witnessed. Machine learning in healthcare and medicine has the potential to predict patient outcomes with better accuracy, better prepare individuals for end-of-life care and reduce costs by skipping expensive and unnecessary tests. Many companies are paving the way, such as Health Catalyst, which offers its open source machine learning API in an attempt to increase adoption by more individuals and groups, as well as large multinational corporations such as IBM, which through its supercomputer Watson has begun to incorporate machine learning algorithms to help advance genomic testing and analysis. But the final diagnosis for machine learning is not one hundred percent healthy. The industry has been slow to adopt the new technology, and the protection surrounding consumer records proves to be an obstacle to a process that thrives on consuming vast amounts of data. However, machine learning does not need to be perfect or completely accurate in predicting outcomes. It just needs to be better than humans currently are, and it seems to be quickly approaching that benchmark.
