Over a year ago, Dr. David Ross mentioned the NYC Macroscope project to Piper and me, and I’ve wanted to record a podcast episode about it ever since. I think NYC Macroscope caught my attention because it leverages data that already exist in electronic health records (EHRs) for population health surveillance. For many chronic conditions, survey data currently serve as the “gold standard” for public health surveillance. Surveys can be incredibly resource intensive, do not always provide representation across subgroups in a population, and there are sometimes long lag times between when data are collected and when the results are published. The idea of EHR data supplementing these surveys intrigued me.

At the same time, I had a lot of questions. How would health care share data with public health? How reliable are the data? How generalizable are they to the population as a whole?

I’m very grateful Sharon Perlman, the Director of Special Projects in the Division of Epidemiology at the New York City Department of Health and Mental Hygiene, agreed to be a podcast guest. In this episode, Sharon discusses how the NYC Macroscope got its start, the strengths and limitations of EHR-based population health surveillance, and some of the future directions for this work in New York City.

If you’re interested in learning more about NYC Macroscope, check out the series of papers Sharon and her team published in the December 2016 issue of eGEMs.

If you haven’t subscribed to Inform Me, Informatics, you can do so on iTunes or Soundcloud. We’re also now on Google Play and Player.fm. Like the podcast? Please consider rating us on iTunes! This will help other listeners find out about us.

The NYC Macroscope team

JESSICA

Hey. Welcome to another episode of “Inform Me, Informatics!” I’m Jessica Hill. Today’s episode is all about data from electronic health records, or EHRs, and how these data can be used in public health practice.

So, let’s start here. Imagine you go to a doctor’s visit. The provider probably measured basic health information like your height, weight and blood pressure. Maybe they entered that information directly into an electronic health record. This information can be incredibly important to your individual care. For example, maybe you’re trying to manage high blood pressure. So clearly you and your provider will want to check your blood pressure at each appointment and then be able to track it over time.

But now think about all the patients that go to that same doctor’s office. All of their information is also in the EHR. It’s sitting there waiting for the next time a particular patient has an appointment. In addition to using the EHR data to treat individual patients, a doctor might want to look across their patients to better understand if certain conditions come up a lot. And then what if we could combine all the EHR data from many different doctors’ offices in a whole ZIP code, maybe in a whole city to get an even bigger picture of the health of that population? All the individual information could be deidentified so we’re no longer focused on treating individuals and then it would be like many, many puzzle pieces coming together to give an overall look at that population.

The New York City Macroscope project explores this very idea. NYC Macroscope is an EHR-based chronic disease surveillance system. The team actually just published their series of articles in the journal eGEMs and one of those articles asks, “Can electronic health records be used for population health surveillance?” There are a lot of important pieces to this question. How does healthcare share the data with public health? How reliable are the data? How generalizable are they to the population as a whole? I recently spoke with Sharon Perlman, the Director of Special Projects in the Division of Epidemiology at the New York City Department of Health and Mental Hygiene. We spoke over Skype so as you’ll hear…well, it sounds like a conversation we recorded over Skype but it’s a really interesting phone conversation. We talked about NYC Macroscope and how it got its start, the strengths and limitations of EHR-based surveillance and some of the future directions for this work in New York City.

SHARON

So, the Macroscope is an electronic health record surveillance system and what that means is that we’re using data from providers that have electronic health records. So, we have a network of electronic health records throughout the city which is part of the Primary Care Information Project, and the Macroscope is able to aggregate data from all those electronic health records and come up with an aggregate estimate for all those providers and can say something about the population as a whole.

JESSICA

So, I was wondering if you could give us a little background on this project. How did Macroscope come to be?

SHARON

Well, in about 2011 or even before that, we were thinking about doing a second NYC HANES, the Health and Nutrition Examination Survey which is based on the national HANES survey and is a citywide representative survey where we go into people’s homes and we ask them lots of questions and with their permission, we take blood and urine and can do tests of different diseases and we can measure blood pressure. So, we were looking to do a second one. We had done the first one in 2004. And then at the same time, our commissioner Dr. Farley was thinking about using electronic health records for public health surveillance and he came up with this great idea of using the second NYC HANES to validate health estimates from our EHR system.

JESSICA

What problem is the Macroscope trying to solve?

SHARON

Well, we were interested initially in whether electronic health records could be used for population health estimates. As you know, the data are collected primarily for clinical purposes and hadn’t really been used, you know, nationwide for population health surveillance before. So, we were interested to see how representative those data were and whether they could tell us something about the city as a whole.

So here at the Health Department we have many sources of population health data. We have surveys. We have birth and death records. We have access to hospitalization records. We have reportable diseases. So, we were interested in seeing how the EHR data would complement that and, you know, potentially replace some of our survey data. There’s just such an amazing opportunity with electronic health record data because they cover so many people and, you know, if the network is in place, then it’s a relatively easy way to access the data. It can be done in real time and at relatively low cost.

JESSICA

Great. So, you mentioned some of the partners involved already but I was wondering if you could kinda paint the picture for all the different groups that were involved.

SHARON

So, there were many people here involved at the Health Department. The electronic health record data is from a bureau at the Health Department called the Primary Care Information Project or PCIP and they’ve established this large network of outpatient providers that covers about two million patients in New York City, about a quarter of the population. And, you know, in establishing that network, they certainly had relationships with providers and hospitals and community health organizations. And so, we were very lucky that that was in place and we could draw from that data. So PCIP worked closely with eClinicalWorks [inaudible 00:06:23] to develop the EHR system initially. And they helped us to incorporate public health measures. We worked closely with the City University of New York School of Public Health. And then now some of our partners [inaudible 00:06:35] have moved to NYU so we’re working with NYU. And then our funders were really critical. So, the de Beaumont Foundation, the Robert Wood Johnson Foundation, the New York State Health Foundation, Robin Hood and Doris Duke provided critical funding to get this off the ground.

JESSICA

That PCIP, was that already in place before this project?

SHARON

Yeah. We were lucky that that was already in place. That had been established around 2007 and the original thought behind PCIP was to try and get electronic health record systems into underserved areas and the Health Department subsidized the cost and provided technical assistance to outpatient providers in those areas.

JESSICA

So, for the electronic health record data, how does that work? Like, how do you…you say like, “Okay, these data, you know, exist.” How do you then access them and ask this question of whether or not they’re comparable to the gold standard surveys?

SHARON

Yeah, that’s a good question. So, we have our colleagues in PCIP who are very familiar with the data and they’re able to query their providers who are a part of the network and the way it’s set up, the providers have signed an agreement allowing access to their data and the health department can ask for, you know, a specific query of, like, how many women, 20 to 39 have diabetes. And then each practice or provider will return an aggregate count of the number of patients that meet those criteria. So, there’s never any identifiable individual-level data that’s returned. It’s just aggregate and there’s no identifiable information. So that solved some of the privacy issues. You know, our colleagues at PCIP can write these queries that they send out to each practice. They run overnight typically. Sometimes they run several nights to make sure that they get a certain number of practices because some practices may be…you know, the computer is turned off for the night or some glitch why it doesn’t work. And when they run the queries, they get, you know, provider-level data. And then we aggregate that data and we weight it.
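
Editor’s note: the aggregate-count approach Sharon describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Macroscope team’s code. The `PracticeCount` record, the `pooled_prevalence` function, the practice IDs, the counts, and the weights are all hypothetical, and the episode does not describe the actual weighting scheme.

```python
from dataclasses import dataclass


@dataclass
class PracticeCount:
    """One practice's reply to a query: aggregate counts only, no patient rows."""
    practice_id: str
    matching: int  # patients meeting the query criteria (e.g. women 20-39 with diabetes)
    total: int     # all patients in the queried group (e.g. women 20-39)


def pooled_prevalence(counts, weights=None):
    """Combine practice-level counts into one weighted proportion.

    `weights` maps practice_id -> weight; practices without an entry get 1.0.
    The weights here are placeholders for whatever scheme an analyst chooses.
    """
    weights = weights or {}
    numerator = sum(weights.get(c.practice_id, 1.0) * c.matching for c in counts)
    denominator = sum(weights.get(c.practice_id, 1.0) * c.total for c in counts)
    if denominator == 0:
        raise ValueError("no patients in the denominator")
    return numerator / denominator


# Hypothetical replies from three practices.
replies = [
    PracticeCount("A", matching=30, total=400),
    PracticeCount("B", matching=12, total=100),
    PracticeCount("C", matching=8, total=500),
]

unweighted = pooled_prevalence(replies)            # 50 / 1000
weighted = pooled_prevalence(replies, {"B": 2.0})  # 62 / 1100
```

Because each practice only ever returns two integers per query, nothing identifiable leaves the practice, which is the privacy property Sharon points to.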

JESSICA

Am I understanding correctly if it’s kind of similar in a survey in something like New York City HANES, you have a pool of people who are your study sample? And so, you kind of ask a set of questions among that group of people. So, for the electronic health records, the group or the pool of people are individuals who are seeing providers within the PCIP. And so, you’re just asking the same questions but instead of asking individuals directly, you’re querying their electronic health records? Is that accurate?

SHARON

That’s right. Generally, can EHRs be used for population and healthcare [inaudible 00:09:22]? I think the answer is yes, they can be used for certain indicators and as we’ve seen from our work, some indicators perform very well and others don’t at all. So, you know, I think a lesson learned here is that for some things, other jurisdictions might be able to adopt certain indicators like hypertension or diabetes. Some indicators don’t perform very well at all. And, you know, they might improve in the future if documentation improves or they may not. But I think the overall answer is yes.

JESSICA

So that’s actually really interesting to think about how some indicators or important kind of data points that we may want to know about a population, some of them are, you know…performed well through surveillance with EHRs and some don’t. What are some indicators that don’t do well and what are some reasons why that might be?

SHARON

Yeah, so two that we looked at that didn’t do as well were depression and influenza vaccination. So, when we looked at the Macroscope estimate for depression, it was about 8% and when we compared it with NYC HANES which was about 15%…so it was about half of what we would expect it to be. And we suspect that’s because providers don’t routinely screen for depression. We use the PHQ-9, which is the Patient Health Questionnaire; that was the screening tool that we used to compare the estimates in both sources. And when we looked at the Macroscope data, we found that only 33% of records had information about depression status. So, the data were showing us that providers were in fact not routinely asking about depression.

And then with influenza vaccination it was kind of similar. We found that the Macroscope estimate was about half or a little less than half of what we saw in HANES. And we think that’s because people frequently get flu vaccination at drug stores or in the workplace or other places where it doesn’t make it back into the EHR record.

JESSICA

Why would something like hypertension perform better?

SHARON

When someone goes to see a provider, almost always the provider takes blood pressure. I mean, that’s very standard. So, we have very high rates of data completion for that indicator. And we found that it performed very well compared with NYC HANES. You know, it’s similar to the survey where the person goes into the home and takes blood pressure in the same way.

JESSICA

So, what do these findings mean for the future of EHR surveillance efforts?

SHARON

It’ll be interesting to see how some indicators change over time especially as Meaningful Use is implemented further and there becomes more standardization of how data are collected and perhaps more completeness in how different diseases and conditions are documented. And, you know, I think it’s a really exciting time for the field now. There are increasing numbers of jurisdictions who are using EHRs for population health surveillance and, you know, it’s just been really exciting for our group to talk to some other jurisdictions that are working on this and to learn from them and share what we’ve learned.

JESSICA

Well, what are the implications for other jurisdictions, particularly local health departments that are interested in local level data?

SHARON

You know, I think one of the really exciting potentials of EHR data would be small area estimates. So, you know, because there are so many records in most EHR systems, it allows you to look at smaller areas like neighborhoods or potentially ZIP codes to get better information than we currently have, you know, in New York City and in other areas. And then for some jurisdictions which really don’t have much local data, you know, it’s a really valuable source of data. So, they might get some information that they never had before.

JESSICA

Would you mind kind of explaining why that’s a benefit from EHR data that maybe a survey doesn’t provide? Like, why would EHR data help you with that ZIP code level more than a survey might?

SHARON

Yeah, that’s a good question. So, with surveys, they’re often very labor intensive and costly to conduct. And because of that, we typically have sample sizes of maybe 1,000 to 2,000 people. And because people are randomly selected throughout the city, we are able to make generalizations about the city as a whole and sometimes smaller areas, boroughs, or we have United Hospital Fund areas which are kind of aggregates of different ZIP codes. But because the sample size isn’t that large, if we try and get ZIP code level estimates, we often have very few people in that ZIP code so it’s…we can’t get a reliable estimate. But with EHRs we can have, you know, maybe 10,000 people in a ZIP code. So that allows us to say much more about that population.
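
Editor’s note: a rough way to see why sample size matters so much for ZIP code level estimates is the textbook margin of error for a proportion. The sketch below assumes simple random sampling and a normal approximation, which real surveys and EHR networks only approximate. The 30% prevalence and the figure of 10 survey respondents in a ZIP code are hypothetical; the 10,000 figure echoes Sharon’s example.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from n people."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical: a condition with 30% prevalence in one ZIP code.
survey_moe = margin_of_error(0.30, 10)      # a handful of survey respondents
ehr_moe = margin_of_error(0.30, 10_000)     # EHR records in the same ZIP code

print(f"survey: +/- {survey_moe:.1%}, EHR: +/- {ehr_moe:.1%}")
```

The larger count only narrows the statistical uncertainty. It does not address whether patients in the EHR network resemble everyone living in that ZIP code, which is a separate question about generalizability.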

JESSICA

Are there ever concerns around the differences between groups in the population that are going to a primary care provider and accessing care versus, like, how generalizable they are to the larger city population?

SHARON

Yeah, definitely. And thank you for asking that question. So initially when we were doing this, we were thinking, you know, we might be able to say something about New York City as a whole but then we did an analysis of the population who are in care compared to the population that’s not in care. So, people who are not seeing providers. And then we saw that there really were substantial differences. So, when we did this, we limited it to the population that had seen a provider, a primary care provider in the past year. And when we got our survey estimates from HANES and from the Community Health Survey, we limited those data to people who had seen a provider in the past year so it would be somewhat comparable. But I think that is an important limitation that these estimates are mainly generalizable to the population in care.

JESSICA

Do you see using EHR data for population health surveillance, do you see that as replacing some of the gold standards we think of like HANES or in the case of New York City, NYC HANES? Is that the vision that it will replace it or somehow complement? Like, what are your thoughts around that?

SHARON

Yeah, no. I don’t see it as replacing survey data. I think survey data is really critical and what we’ve learned from this project is that there are certain measures which we can really only get from survey data and it also provides really valuable information for the population that’s not in care and we don’t know anything about that from the EHR system. So, I think EHR data is very valuable and can complement survey data but will not replace it.

JESSICA

But are there things that you think that EHR data can kind of unlock or somehow illustrate with more detail given just that these data exist and if public health has access to them, they may be able to answer some different questions?

SHARON

You know, we started with this group of about a dozen indicators but certainly we are very interested in other indicators and other groups have explored indicators which we really have no information about currently. Like, there’s a group in Massachusetts that’s looked at Lou Gehrig’s Disease and estimates from EHRs for that and compared it with an ALS registry that they have there. So that, you know, that’s really fascinating. If we were able at some point to get an estimate for how many people in New York City might have Lou Gehrig’s Disease or epilepsy or Alzheimer’s or other things that we don’t currently collect with our survey. With any data source, it’s important to understand what it means and to use it responsibly.

JESSICA

What does use it responsibly mean?

SHARON

Well, you know, I think with electronic health records, there’s a lot of excitement about the data and how it can be used which is great. But, you know, it’s important to take the time to validate it if it needs to be validated, to understand what the population is, who’s in that EHR system and how it compares to the general population that they’re trying to study. And, you know, when writing the queries, thinking about how those are written and what providers are going to be included. Like, we had this super cohort with a minimum documentation standard. So that enabled us to get better quality data. So, all those things, you know, just thinking carefully about what the data represented and how to extract it.

JESSICA

What are some next steps for the Macroscope itself? Now that you’ve, you know, published these three papers, is it, like, game over or are things going to continue?

SHARON

Yeah, that’s it. We’re done.

JESSICA

Yeah.

SHARON

No. We are very interested in small area estimates and we’ve started some work on that. We’re interested in looking at our data more by race. Initially we had stratified by age, sex and neighborhood income. So, we are interested in going back and running our data by race and seeing how reliable those estimates are. And then the other thing we’re thinking about now is looking at trends over time and how those trends compare with our Community Health Survey.

JESSICA

And how might that information be useful to the Department of Health?

SHARON

It would be great information for us. I mean, we’re very interested in health inequities and in a lot of the work we do, we analyze our data by different racial, ethnic groups and thinking about programs and interventions that we can design to address those and monitoring those over time to see if they’re getting better or worse. And, you know, with EHR data, as I mentioned before, there is the ability to look at smaller areas and to look at racial subgroups. And then with some of our data sources, we are able to look at some subgroups. But often we’re limited by our sample size, not having enough people to get reliable estimates. But with the EHRs we really have a huge sample so we might be able to look at some of those subgroups more reliably.

JESSICA

And does that information then flow into the programming arms of the department or…like, how would that information be used?

SHARON

Yeah, definitely. So our programs use our data all the time to, you know, think about what programming is necessary, to evaluate what they’re doing. And data in general is just really important in terms of advocacy, sharing it with our partners throughout New York City who are interested in data like community health organizations, the public, politicians and others who are interested in the health of the city overall or in health of other…their neighborhoods if it’s a politician that represents their neighborhood. So, to let people know what the health issues are and to advocate, to ask for resources to do what we think needs to be done.

JESSICA

You’re talking about kind of connecting with politicians. Really then that ZIP code specific data really seems like it would be very valuable because it’s not just, you know, in the city as a whole but particularly in the neighborhoods that you represent, this is what we’re seeing.

SHARON

Exactly. Exactly. And, you know, for community health organizations or hospitals which serve local populations. They’re very interested in that kinda data too.

JESSICA

We ask all of our guests one question and that is how do you define informatics?

SHARON

I guess generally it’s how information is collected and stored and used and how technology can be used to do that most effectively. And for our field, how that information is used to promote health.

JESSICA

While you were talking, I have another question that came to mind which is how did the name Macroscope come to be?

SHARON

Oh, gosh, yeah. There was a lot of discussion about that and we brainstormed a whole bunch of names and, you know, it’s kind of, like…it’s similar to a microscope, when you look at a microscope, you’re looking, you know, usually just at a slide or the results from an individual patient and the idea behind the Macroscope is kind of, like, you’re looking into that window but it’s not an individual. It’s a whole population. So, it’s at a macro level.

JESSICA

Many thanks to Sharon Perlman and the entire NYC Macroscope team. I found your work incredibly interesting and I think it’s a great example of how technology and informatics are shaping public health practice. If you’d like to learn more about NYC Macroscope, you can check out the articles the team published in the December 2016 issue of eGEMs. You can find it online at this address, repository.edm-forum.org. Then use the search term Macroscope. We’ll also link to the articles from our website which is phii.org.

This podcast is a project of the Public Health Informatics Institute and the Informatics Academy. You can find out more at phii.org or follow us on Twitter @phinformatics. And hey, if you like the podcast, please write us a review on iTunes. That will help other people find out about the podcast which in turn helps other people find out about public health informatics.

This episode included songs composed by Kevin MacLeod. Many thanks to our production team. Especially to Piper Hale, our macro level producer. I’m Jessica Hill and you’ve been informed.

BUTTON

JESSICA

I’m Jessica Hill. I didn’t like the way I said my name. Okay.

Copyright © 2021 Public Health Informatics Institute | All rights reserved