Overview of Health Care Data Analytics

Welcome to Component 24, Healthcare and Data Analytics, Unit 1, Introduction to Healthcare Data Analytics. This is Lecture A. This unit introduces the basics of working with healthcare data for the novice. The different types of data are explored, as well as the array of technology and tools available for working with data. Big data is defined, and the special challenges related to working with data are discussed.

The objectives for this unit, Introduction to Healthcare Data Analytics, Lecture A, are to give a basic overview of data analytics in healthcare and describe the nine steps of the data analytics process.

In 2011, Peter Sondergaard, senior vice president and global head of research for Gartner, the worldwide information technology research and advisory company, stated that "information is the oil of the 21st century, and analytics is the combustion engine." So what exactly is analytics, and why is it so important to 21st-century healthcare?

The Institute of Medicine, in its 2012 report titled Best Care at Lower Cost: The Path to Continuously Learning Health Care in America, stated that "America's health care system has become far too complex and costly to continue business as usual. Pervasive inefficiencies, an inability to manage a rapidly deepening clinical knowledge base, and a reward system poorly focused on key patient needs all hinder improvements in the safety and quality of care and threaten the nation's economic stability and global competitiveness. Achieving higher quality care at lower cost will require fundamental commitments to the incentives, culture, and leadership that foster continuous learning, as the lessons from research and each care experience are systematically captured, assessed, and translated into reliable care."

They define a learning healthcare system as a system designed to generate and apply the best evidence for the collaborative health care choices of each patient and provider, to drive the process of discovery as a natural outgrowth of patient care, and to
ensure innovation, quality, safety, and value in healthcare.

Consider the various information systems you've learned about so far. A hospital will likely have an electronic health record system as well as specialized departmental systems for laboratory, diagnostic imaging, pharmacy, nutrition services, billing, anatomic pathology, and so on. Each of these systems is designed and intended for clinical use, in other words patient care, and so they capture specific data about the patient. However, none of these systems has a complete set of data, for any individual patient or for a group of patients (such as all patients who were admitted in January with a certain diagnosis), that can be used for analysis and reporting. Obtaining deep insight into what is happening with individual patients, as well as across groups of patients, requires aggregating data from many systems and performing statistical analyses of this aggregated data.

In contrast to the various clinical systems discussed on the previous slide, a clinical data warehouse brings together data for a patient into a single coordinated location, and this location is used for analysis and reporting purposes. This is accomplished via a process known as extract, transform, load, or ETL, which retrieves data from the various clinical systems, synchronizes the formats of the data in a process called transformation, cleans up the data, and then imports the data into the database of the clinical data warehouse.

The transformation process is especially important, as data can be stored in a variety of forms across systems. For example, a laboratory system might use the letters M, F, or U for patient gender (male, female, or unknown), while the radiology information system might use 1, 2, or 9 instead. However, they must match the designations used in the clinical data warehouse, and that process of
converting them to match is called transformation. Another important step is ensuring that all of a patient's records from the various systems are linked together. This typically requires a master patient index, sometimes called a master person index, to link a patient's various identifiers across systems.

Now that you have an understanding of the need for a centralized, coordinated location for patient data that can be used for analysis and reporting, we'll define the term analytics and explore the different types of analytics.

What is analytics? Isn't it the same thing as statistics? The term analytics has been used in a variety of ways and with different meanings. In fact, Gartner stated that "analytics has emerged as a catch-all term for a variety of different business intelligence (BI) and application-related initiatives." In 2015, the National Institute of Standards and Technology (NIST) issued a formal definition of analytics as follows: "The term analytics refers to the discovery of meaningful patterns in data, and is one of the steps in the data life cycle of collection of raw data, preparation of information, analysis of patterns to synthesize knowledge, and action to produce value."

As shown in this diagram, analytics is the entire process of data collection, extraction, transformation, analysis, interpretation, and reporting. It includes statistical analysis as one of the steps. Further, NIST stated that "analytics is used to refer to the methods, their implementations in tools, and the results of the use of the tools as interpreted by the practitioner. The analytics process is the synthesis of knowledge from information."

IBM in 2013 categorized analytics into three types. Descriptive uses business intelligence and data mining to ask, "What has happened?" Predictive uses statistical models and forecasts to ask, "What could happen?" Prescriptive uses optimization and simulation to ask, "What should we do?" To these three types Gartner adds a fourth type, diagnostic analytics, which they define as a form of advanced
analytics which examines data or content to answer the question "Why did it happen?"

As shown in this diagram, the simplest type of analytics starts in the lower left-hand corner with descriptive analytics. Diagnostic analytics are more valuable to the institution but also more difficult to perform. Even more difficult, and also more valuable, are predictive analytics. Finally, the most difficult and also the most valuable are prescriptive analytics. Let's look at each of these now.

Descriptive analytics are the simplest type of analytics and simply describe the data. Common statistics are used, such as the number of laboratory tests, the average age of patients, or the average length of stay in the hospital for patients with a particular diagnosis. Descriptive analytics are often presented as pie charts, bar or column charts, tables, or written narratives.

Gartner defines diagnostic analytics as "a form of advanced analytics which examines data or content to answer the question 'Why did it happen?'" Tools used for diagnostic analytics include drill-down techniques, data discovery, and correlations.

Let's start with an example before going into the formal definitions. Kaiser Permanente analyzed data on infants to develop an algorithm for classifying which babies were at risk for developing sepsis and, conversely, which babies did not need to be treated. Sepsis is described by the Mayo Clinic as "a potentially life-threatening complication of an infection. Sepsis occurs when chemicals released into the bloodstream to fight the infection trigger inflammatory responses throughout the body. This inflammation can trigger a cascade of changes that can damage multiple organ systems, causing them to fail. If sepsis progresses to septic shock, blood pressure drops dramatically, which may lead to death." Kaiser Permanente stated that "judicious application of our scheme could result in decreased antibiotic treatment in 80,000 to 240,000 US newborns each year." With that example in
mind, let's now look at a definition of predictive analytics and how the Kaiser Permanente case is an example of predictive analytics. Gartner states that predictive analytics has the following four attributes.

First, an emphasis on prediction rather than description, classifying, or clustering. In the Kaiser Permanente example, they were trying to predict which newborns were at risk of developing a life-threatening condition so that they could treat the babies to prevent it.

The second attribute defined by Gartner is rapid analysis, often in hours or days. Consider again the sepsis example. Sepsis is a rapidly progressing condition that, if it progresses to the most severe stage of septic shock, can have a 50% mortality rate. Therefore, analysis of the data to predict which infants are at risk of developing this condition must be done rapidly, not over a period of weeks or months.

The third attribute defined by Gartner is an emphasis on the business relevance of the resulting insights. Consider the word relevance and how that would apply to the example of infants with a life-threatening infection. Information that would directly affect care and prevent infants from dying is relevant.

And finally, the fourth attribute defined by Gartner is an emphasis on ease of use, thus making the tools accessible to business users. In other words, these tools should be available to the clinical staff to use.

However, it is important to note that, as Michael Wu of Lithium states, "the purpose of predictive analytics is not to tell you what will happen in the future. It cannot do that. In fact, no analytics can do that. Predictive analytics can only forecast what might happen in the future, because all predictive analytics are probabilistic in nature."

This brings us, then, to the highest level of analytics, which is prescriptive analytics. Gartner defines prescriptive analytics as a form of advanced analytics which examines data or content to answer the question of what should be done or what can we do to make
something happen, and is characterized by techniques such as graph analysis, simulation, complex event processing, neural networks, recommendation engines, heuristics, and machine learning.

Now let's look at the steps in data analysis in more detail. Data analytics involves a sequence of steps: number one, identify the problem; number two, identify what data are needed and where those data are located; number three, develop a plan for analysis and a plan for retrieval; number four, extract the data; number five, check, clean, and prepare the data for analysis; number six, analyze and interpret the data; number seven, visualize the data; number eight, disseminate the new knowledge; and number nine, implement the knowledge into the organization. We will go into each of these in more detail on the next slides.

The first step is to define the problem to be studied or, in business terms, identify the business case. Why is this important to study? How will the result impact patient care or the institution? You must have a clearly stated problem or question to guide the rest of the process. You also need to identify any stakeholders: people who have a direct interest in this problem and who need to receive the results of the analysis at the end of the process.

Next, the data needed for the analysis need to be identified. Where are the data elements located? In what system or systems, and in what database tables? Who is the contact person for each system? Who will be responsible for retrieving the data? Is there a clinical data warehouse? If not, the required data elements may be stored in different systems, requiring multiple extraction steps.

A plan for retrieving the data from the various systems, along with a plan for checking that all the data required were actually retrieved, should be developed. There needs to be some way to determine how many records are expected and how many were actually retrieved. This may involve cross-checking against other systems. This step will require the participation of the individuals who normally
perform data retrieval from the systems involved. An analysis plan also needs to be developed. A statistician should be consulted, and questions to be addressed here include: What is the population? What size does the sample need to be? What statistical tests should be performed?

The next step is the actual extraction of the data from the system or systems involved. After the data are retrieved, they need to be checked for completeness. Is the set of data complete? Were all the records that should have been retrieved actually retrieved? At a minimum, descriptive statistics such as counts must be performed at this step. At this point, changes to the extraction plan may be needed, and another extraction from the source systems may need to take place.

Once a complete set of records is extracted from the source systems, errors in the records need to be identified and corrected. All data have errors, such as transposed letters in names and incorrect values. Decisions must be made about how to handle empty fields. Next, data must also be synchronized, or transformed. For example, patient gender in one system in the hospital may be stored as M, F, U, while another system might use 1, 2, 9. One set of values must be changed so that all the records are using the same values. After all necessary transformation steps have been completed, the data are then imported into the destination system where the actual data analysis and reporting will take place. This may be a system as complex as a clinical data warehouse or as simple as a desktop computer.

The data are now in the system where the analysis will be run, and it should be a complete set of data. You need to check that everything is ready for analysis. Did you get what you needed? Check and verify this against the analysis plan that was developed in step 3, and confirm that you have everything needed to address the problem that was identified in step 1.

Now you are ready to do the actual analysis. To execute the analysis plan that was developed earlier, perform the statistical
analyses and enlist the assistance of the statistician to confirm the interpretations and conclusions of your analysis.

Now you need to be able to communicate the results of your analysis and how the results address the problem from step 1. This communication must be very clear and rapidly understandable to the decision-makers in the institution, so selecting an appropriate representation for your findings is essential. Choose a visualization that is appropriate for the type of data. For example, categorical data can be represented with column or bar charts, tables, and pivot tables, while quantitative data can be shown with histograms and a wide variety of other types of graphics, such as scatter plots and star plots. Some common tools are Tableau and the chart function in Microsoft Excel.

Once the analysis, interpretation, and any visualizations are complete, a report must be developed. It might be a formal written document, an email, or a presentation. Regardless of the delivery method, the report needs to clearly state the original problem, the process that was used to address the problem, and then the results of the analysis along with the supporting visualizations. This represents new knowledge and needs to be distributed to the stakeholders that were identified in step 1.

Finally, the new knowledge needs to be implemented to address the original problem. This will require the participation of the stakeholders.

For more information on these topics, read the articles "Six Steps of an Analytics Project" by Jaideep Khanduja and "The Seven Key Steps of Data Analysis" by Gwen Shapira. Article URLs are listed on the references slide at the end of this presentation.

This concludes Lecture A of Component 24, Healthcare and Data Analytics, Unit 1, Introduction to Healthcare Data Analytics. To summarize, analytics is the entire process of data collection, extraction, transformation, analysis, interpretation, and reporting. It can be categorized into three types: descriptive, predictive, and prescriptive.
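
As an illustration of the transformation and completeness checks described in the lecture, here is a minimal Python sketch. It is not taken from the lecture: the function names, the "mrn" field, the sample rows, and the pairing of 1, 2, 9 with M, F, U are all assumptions made for the example, and a real clinical data warehouse would define its own standard codes.

```python
# Hypothetical code maps. The lecture only says one system uses M/F/U and
# another uses 1/2/9; pairing 1->M, 2->F, 9->U is an assumption.
LAB_GENDER_MAP = {"M": "M", "F": "F", "U": "U"}
RADIOLOGY_GENDER_MAP = {"1": "M", "2": "F", "9": "U"}


def transform_gender(records, code_map):
    """Return copies of records with gender recoded to the warehouse standard.

    Codes missing from code_map become "U" (unknown). The number of such
    records is returned too, so they can be reviewed instead of silently lost.
    """
    transformed = []
    unmapped = 0
    for record in records:
        raw = str(record.get("gender", "")).strip().upper()
        if raw not in code_map:
            unmapped += 1
        transformed.append({**record, "gender": code_map.get(raw, "U")})
    return transformed, unmapped


def check_counts(expected, records):
    """Completeness check: were all expected records actually retrieved?"""
    return expected == len(records)


# Made-up sample extracts from two source systems ("mrn" is a hypothetical key).
lab_rows = [{"mrn": "A1", "gender": "f"}, {"mrn": "A2", "gender": "U"}]
rad_rows = [
    {"mrn": "A1", "gender": 2},
    {"mrn": "A3", "gender": 1},
    {"mrn": "A4", "gender": ""},  # empty field: a decision is needed
]

lab_clean, lab_unmapped = transform_gender(lab_rows, LAB_GENDER_MAP)
rad_clean, rad_unmapped = transform_gender(rad_rows, RADIOLOGY_GENDER_MAP)
```

Treating an empty or unrecognized code as unknown is only one possible policy for empty fields. Counting those records, as the sketch does, keeps that decision visible to whoever reviews the extraction.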

