Two studies using PicnicHealth hemophilia data to be presented at ASH 2021 Annual Meeting

November 16, 2021

The American Society of Hematology (ASH) recently accepted two of our abstracts for the upcoming 63rd ASH Annual Meeting, which will be held December 11-14, 2021. Both abstracts explore data from the PicnicHealth Hemophilia Cohort and are published in partnership with Roche-Genentech and key opinion leaders in the hemophilia space, Mark Skinner (McMaster University, ON, Canada), Michelle Witkop (National Hemophilia Foundation, NY, NY) and Amy Shapiro (Indiana Hemophilia and Thrombosis Center, IN). The first abstract describes PicnicHealth’s novel methodology for building patient-centric, real-world datasets and has been accepted for an oral presentation. The second abstract builds on the methodology abstract to characterize patients with mild to moderate hemophilia A - a group historically underrepresented in scientific literature.

Abstract #594: A Novel Methodology for Building Longitudinal, Patient-Centric Real World Datasets in Hemophilia A

This abstract reviews our innovative patient-centered approach to building longitudinal datasets, which avoids data gaps common in other datasets. For example, claims data is confined to billing codes, and registries are highly resource-intensive to stand up. The study points to the high quality of PicnicHealth’s data abstraction, the volume of medical record data collected and PicnicHealth’s ability to engage patients directly to go beyond what’s available in the record. As a measure of abstraction quality, PicnicHealth’s inter-abstractor agreement scores are consistently above 95% for every data type. In terms of volume, a median of 50 clinical documents from 11 years were processed for each patient in this cohort. Finally, the dataset is being supplemented with patient-reported outcomes (PROs), which patients complete biweekly to share information about their bleeds. As of June 2021, the average PRO response rate was 90.3%. These results not only highlight the quality of the PicnicHealth data in hemophilia A, but also confirm the validity of our novel approach to building real-world datasets.
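As a rough illustration of how summary metrics like the ones above could be computed, here is a minimal Python sketch. It assumes simple percent agreement between two abstractors; the abstract does not say which agreement statistic PicnicHealth actually uses. All function names and data values below are hypothetical toy examples, not cohort data.

```python
from statistics import median


def percent_agreement(labels_a, labels_b):
    """Percentage of items on which two abstractors recorded the same value.

    Assumption: plain percent agreement, not a chance-corrected statistic.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


def response_rate(surveys_sent, surveys_completed):
    """Percentage of sent surveys that were completed."""
    if surveys_sent <= 0:
        raise ValueError("surveys_sent must be positive")
    return 100.0 * surveys_completed / surveys_sent


# Hypothetical toy data: bleed locations recorded by two abstractors.
abstractor_1 = ["joint", "muscle", "joint", "none"]
abstractor_2 = ["joint", "muscle", "soft tissue", "none"]
agreement = percent_agreement(abstractor_1, abstractor_2)

# Hypothetical document counts for three patients.
docs_per_patient = [12, 50, 87]
median_docs = median(docs_per_patient)

# Hypothetical survey counts for one patient.
rate = response_rate(surveys_sent=10, surveys_completed=9)

print(agreement, median_docs, rate)
```

Percent agreement is the simplest option; a chance-corrected statistic such as Cohen's kappa would be a reasonable alternative when some values are much more common than others.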

Abstract #2107: Characterizing Mild and Moderate Hemophilia A Patients in the Real World: A Patient-Centric Approach

This abstract applies our overall approach to characterize patients with mild to moderate hemophilia A. Because these patients are historically underrepresented in the scientific literature, it is especially important to study mild to moderate hemophilia A in the real-world context to truly understand the patient population, disease burden and healthcare resource utilization. The study’s results highlight two findings: (1) the data suggest that males are diagnosed at a younger age, on average, than females, and (2) patients ages 45-64 have more clinical visits (including inpatient and outpatient) than any other age group (0-19, 20-44 and 65+). The study draws on these findings and others to conclude that the characteristics of patients in the PicnicHealth cohort are generally comparable with data from the Centers for Disease Control and Prevention (CDC) on age, BMI and ethnicity. This again confirms the strength of the PicnicHealth approach in building representative, longitudinal real-world datasets.

All ASH abstracts are available online at

About PicnicHealth’s Hemophilia Cohort. PicnicHealth has built a longitudinal real-world hemophilia dataset that includes hemophilia A and B patients with varying levels of disease severity (e.g., mild, moderate and severe). Novel data abstraction models were developed to uncover hard-to-obtain bleed events (e.g., spontaneous and traumatic), bleed location, and annual bleed rates from narrative text. Additionally, the dataset allows researchers to connect bleed events to patient symptoms, comorbidities, and treatments to help better characterize the complete patient experience. The cohort is available for license by life sciences partners and is currently helping researchers understand burden of disease in hemophilia B to support payer conversations for gene therapies under development; helping manufacturers demonstrate the feasibility of value-based agreements in support of upcoming gene therapy product launches; and being used to construct synthetic comparator arms for ongoing clinical studies.

Contact a team member today to learn more about our real-world data research cohorts.

00:00 Intro

[Sydney] Good day to everyone joining us and welcome to today's Xtalks webinar. Today's talk is entitled the RWE ROI Series: The Transformational Value of Real World Data for Drug Development and Regulatory Decision-Making. My name is Sydney Perelmutter and I'll be your Xtalks host for today.

Today's webinar will run for approximately 60 minutes. This presentation includes a Q&A session with our speakers. This webinar is designed to be interactive and webinars work best when you're involved, so please feel free to submit questions and comments for our speakers throughout the presentation using the questions chat box and we'll try to attend to your questions during the Q&A session. This chat box is located in the control panel on the right hand side of your screen. If you require any assistance please contact me at any time by sending a message using this chat panel. At this time all participants are in listen-only mode.

Please note that this event will be recorded and made available for streaming on

At this point I'd like to hand the mic over to Evelyn Pyper who will introduce our speakers for today's event. So Evelyn, you may begin when ready.

01:20 Webinar Overview

[Evelyn] Great! Thank you so much, Sydney, and good morning to everyone in North America attending and good afternoon to anyone joining from Europe or the EMEA region. I'm Evelyn Pyper and I lead Evidence Strategy at PicnicHealth, and I am really thrilled to launch our PicnicHealth 2023 Webinar Series, The RWE ROI.

This overarching theme really came about from the fact that even though real world evidence has become well-recognized as an important part of the drug lifecycle, many of the conversations that we've overheard happening about RWE, whether at conferences or webinars, still felt very surface-level and didn't necessarily convey the critical return on investment or risk of inaction regarding real world evidence. Throughout the rest of this webinar series, each session will convene a unique panel of global stakeholders to explore the RWE ROI from a variety of perspectives.

02:20 Speaker Intros

[Evelyn] So today to launch things off we have a very distinguished panel of speakers to launch this webinar series. Our focus today is on the emerging and really exciting space of real world data for drug development and regulatory decision making.

To introduce our speakers I'll start with Jesper Kjaer. Jesper is Director of The Data Analytics Centre at the Danish Medicines Agency, and Co-Chair for HMA / EMA Big Data Steering Group. Jesper has 20+ years of experience in data management, analyses and data visualization, having previously worked in academia and the pharmaceutical industry.

He has headed up activities in EU Framework Programmes and TransCelerate Biopharma. Jesper is Co-PI of PHAIR: Pharmacovigilance by AI Real-time analyses, applying FHIR resources to healthcare data in DK for real-time safety surveillance. He is involved in EHDS pilot work and is an EMA DARWIN EU advisory board member.

Welcome, Jesper. We're really, really pleased to have you. Thank you so much.

We also have with us today Noga Leviner. Noga is the Co-founder and CEO of PicnicHealth, a digital health company founded on the belief that empowering patients to gain control of their medical data is the critical first step to improving lives and health outcomes and driving scientific advances. PicnicHealth has pioneered the use of advanced human-in-the-loop machine learning algorithms to abstract key endpoints from medical records at scale with improved accuracy and efficiency. Noga is a vocal advocate for patients’ rights to control their own health data, having founded PicnicHealth in 2014 to help patients manage their medical records after personally facing the challenges of being a Crohn’s disease patient in the US medical system. She has spoken widely on the subject including at the White House. Welcome Noga.

[Noga] Good morning.

[Evelyn] Thanks so much. We will also have Najat Khan joining us shortly, but I will save her bio for when she's on our call and will keep you on the edge of your seat for that one. So to kick things off, before we get into a deeper Q&A, we want to really understand the perspective that each of our speakers is coming from, because we've intentionally convened folks that are coming from slightly different angles of this topic. So, a common thread that connects all of our panelists is really a dedication to ensuring that real world data is living up to its potential.

Historically we know that real world evidence has largely focused on more downstream post-authorization use, but in line with today's theme, each of you is contributing to important upstream applications of real-world data for pharmaceutical innovation. To start off, I'll just ask each of you to share briefly, in less than five minutes, an overview of the perspective you're bringing to this topic. And second, what your priorities have been over the past year. I will start with Jesper to provide your regulatory perspective.

05:54 Jesper’s Perspective on RWE

[Jesper] Well, thank you. So I think it's important for me to start with saying that the randomized controlled trials or clinical trials are the gold standard, right? From a regulatory perspective, we're not stepping away from that. But we're realizing that real world data and evidence is really a very important supplement, and sometimes we can even observe that it is probably the only real possibility. You often find yourself in situations where small populations, or unethical situations that are not really suitable for randomization, will lead you to using real world data to generate your evidence. But it's very important for us to look at this not as a binary situation, but something where each of them brings their own value.

Now there's a long history of actually using real world evidence. We may have called it something else in the past, like observational studies. Some of them were more prospective data collection than secondary use of existing data. We have a reality around us that is changing, where more and more health care data is born in an electronic way, and that gives new opportunities, and it would be an absolute waste not to use those opportunities to get better insights into the effects and side effects of treatments and also the value of medical devices. There's in particular the use around side effects, obviously, for very good reasons. You will not be able to uncover every single possible side effect in the clinical trial. We can do as much as we possibly can, within the limits and reasonable design of those clinical trials, to uncover that, but there will be situations where actually just running the clinical trials will put us in a situation where we're looking with a certain focused lens into the population that will ultimately be treated by the new medication. And for that purpose, over the years we've had the instruments of post-approval effectiveness and safety studies that, in many cases, have been the real world data, so to speak, that have been the foundation for us to make further decisions and evaluations.

Now with these new opportunities, we are strengthening those capabilities by using that data for that. And we're starting to see examples where real-time surveillance after approval actually can be a real thing in the data pools that are available. But it's very important for these data sources that we're able to declare them, that we're able to describe them. I would assume that most data sources can actually be used, but they may not all be well-suited for the scientific questions you're going to look at with them. And, not going into the details of what distinguishes one data source from another, the whole understanding that not every data source is equally useful is just very important to have at hand. And even with the best possible data sources that we know of, that have been extensively used and demonstrated their validity and robustness over the years, it’s still very important we actually have the metadata in place to understand these data sources.

We basically need to understand what happened with the data as it was collected, stored, and handled so we can truly understand that what we're looking at is not an effect of circumstances with data handling or new policies but the real situation with the treatments we're looking at. And the real situation we would like to understand is in a real world setting: what are the situations, in all sorts of populations that are now getting the treatment, whether that's by choice of the treating physician, whether that's new indications, or whether that is, as we see in some cases, research leading to an off-label use. We would like to understand that continuously, but more importantly, in the real situation of comorbidity and polypharmacy. We couldn't have all possible outcomes in the clinical trials, so we need the real world data to understand them.

We'll also need the real world data up front to truly understand disease progression, standard of care, and all of that really is underneath of developing new medicines that can inform us in our decision process. Then I think what we are aiming at with some of the initiatives in Europe with the Big Data Steering group of building both a DARWIN EU Data Quality Framework, the training we're doing and also the data standardization strategy, is really to think about data quality and a standardized approach to this. One thing I think we still need to develop and think about is how can we actually not only make passive secondary use of the data, but really look at how do we move ahead to a Learning Healthcare System where prospective data collection is something that is going to be even more helpful so we ultimately can avoid some of the challenges with secondary use. In clinical trials we have the protocols to really define what to do and then describe whenever we deviate, and why do we deviate, so we can truly understand what is cause and effect in the data we're looking at, but we are not left without opportunity with the secondary use of the data.

That's why we are proposing the Data Quality Framework to better describe that. To have the metadata to understand the ability to use the data and the ability to have missing and wrong data and understand why did that originate. [This is] something we can actually deal with in a clinical trial and prospective protocol-related data collection as well. So that's something I hope will continue down that path, and I'm confident with the initiatives we have right now in Europe, we're going to see a lot of real world evidence being used in the decision processes and primarily fueled by the fact that as you establish the DARWIN EU system, you build the quality framework, and then you expand over the years with European health data space the infrastructure is there. The notion that this data will also be used for developing medicines, both in the pre-approval and post-approval phases, will lead to a different way of collecting that data and using that data for that purpose as well, which is something I'm very much looking forward to happening.

[Evelyn] Thank you so much. I think hopefully I can speak for everyone when I say that the model that you have developed is something I think many other regions would be smart to adopt. And certainly as I sit here in Canada, with our, you know, fragmented data ecosystem, it's encouraging when I hear of others trying to follow in the footsteps of the EMA when it comes to this sort of data infrastructure. Before I pass things over to Noga to introduce her solutions provider perspective, I do want to introduce Najat, who has joined us.

13:27 Najat Introduction

[Evelyn] Welcome to Najat.

[Najat] Thank you for having me.

[Evelyn] Thank you so much. We're really pleased to have you. So for everyone, Najat is the Chief Data Science Officer and Global Head of Strategy & Operations for R&D at the Janssen Pharmaceutical Companies of Johnson & Johnson. In this combination of leadership roles — unique in the healthcare industry — Najat is responsible for overseeing the Janssen R&D strategy, pipeline and portfolio optimization and investment, and for fully integrating data science and digital health end-to-end across the pharmaceutical pipeline to drive transformational innovation for patients.

Welcome to Najat. We're looking forward to hearing a bit more about the perspective you bring to this particular topic in a moment.

[Najat] That's great.

14:10 Noga’s Perspective on RWE

[Evelyn] Great. So Noga, I will pass things over to you, to tell us a little bit about the perspective you bring in terms of what PicnicHealth is and what your priorities have been over the past year.

[Noga] Sure, thanks Evelyn. I'll start with a quick introduction to the company. PicnicHealth is a digital health company that builds deep real world datasets. We are able to get this deeper data in two ways. The first, which I'm sure I'll end up talking about quite a bit, is just working directly with consenting patients. We take a very patient-centric approach, meaning we have a direct relationship with the patient, so we can really go get all the data that surrounds that individual, including from all of their medical providers, both historically and longitudinally going forward, but also incorporating their reported outcomes and points of view. So that's one key part.

And then the other side is actually from a technology perspective: having the human-in-the-loop machine learning model that you alluded to earlier in the introduction, which combines state-of-the-art machine learning with human curation. So while the patient relationship gives us that deep, complete view, our machine learning combined with human curation allows us to then go really deep within the records and extract those key clinical endpoints, that really key information that's in unstructured text, in a way that's both high quality, meeting the quality standards that I'm sure we'll continue to hear a lot about, and also possible to scale. That's scalable thanks to the help of the machine learning, so what you end up with is a very complete, clinically-rich view of the patient that's longitudinal, both retrospective and then of course prospective, following the patient and being able to get at their patient-reported outcomes. And because the data doesn't come from a particular site or type of medical facility, we actually get really quite good representativeness, because we are able to include diverse patients who have a variety of comorbidities, disease subtypes, and treatment histories. Because the process to consent and to participate is meeting the patients where they are, like in their own communities at home, it's a much lower bar to participate and sort of lower requirement than what you would see in a clinical trial.

So I'll comment and say that this data, in light of the comments we just heard Jesper make, I'll say this data doesn't replace broad data for something like surveillance. It doesn't replace that really broad, kind of shallower data for something like surveillance, but we see a lot of demand for this richer data, richer real-world data, as well coming from Pharma functions across the product lifecycle from R&D to Market Access, Med Affairs, Commercial teams. And of course relevant today through our work with our life sciences partners, we are increasingly seeing real world data leveraged to understand disease earlier in the value chain and optimize clinical trials.

And you asked about priorities so I'll touch on that quickly. In the past year we raised our $60 million Series C funding, which is fantastic. Going into the current environment, we are growing our research program portfolio, connecting with new patient populations, and always growing and continuing to build our industry partnerships.

[Evelyn] Fantastic! Thanks so much, Noga.

18:31 Najat’s Perspective on RWE

[Evelyn] And finally from the pharmaceutical perspective we have Najat. We heard from your introduction that there's quite a bit that falls under the umbrella of data science and R&D strategy. I'd love to hear about where your priorities have been in terms of real world data specifically and where you're coming from for this topic.

[Najat] Sure, that's great, Evelyn. Thank you so much again and to PicnicHealth for having me here. As you can see I'm trying to get better lighting, but hopefully you can hear me fine, which is what matters the most.

First of all, I want to say that the comments from both panelists are really spot on. You know, maybe just taking a step back, from a pharmaceutical perspective the focus is really on how do we make better medicines for patients, right? I mean, at the end of the day, the insights, the things we can do by leveraging real world data that we couldn't do before, or things we can do better, all are driven by that common purpose of translating it into insights and then a medicine that's for patients. I would say the way I think about real world data is not even just in clinical development. It starts from the very crux of the question that we're trying to answer: what's driving the disease, right?

I mean I think that's where it's really challenging. Like you know even today how much we understand about disease biology. If you think about so many diseases, Alzheimer's, Parkinson's, they're not named after the driver or the causal effect of whatever is driving the disease, but much more so based on the physician that identified the syndrome. So that in itself tells you that there needs to be a deeper understanding of disease biology. So the way I define real world data is of course claims and EHR, but you also have omics, like transcriptomics, genomics, and images, everything that's not clinical trial data. So if you think of the totality of understanding a person's journey, what drives disease, you need to look at the totality of data. And so we are actually leveraging that from the very first part, which is better understanding the drivers of disease and redefining disease, as a result of it.

So as an example, a lot of the audience probably knows about these longitudinal datasets, as Noga was mentioning too, like UK Biobank or Our Future Health. I mean, there are so many. But essentially it's longitudinal. You have all of these different data points collected well, with good data quality, and what that has allowed us to do is to stratify patients that have severe depression based on their genotypic and phenotypic architecture. So really driving precision medicine more into an area like depression, where, let's face it, the medicines that we have today work but we could do better. I think we can all agree that we could do better, and that insight is then translating into how we design our programs for depression. I wanted to use depression as an example because we talk a lot about precision medicine in oncology, but we need to bring that into Immunology and Neuro, and so that's just one example.

Another way that we're using real world data a lot is, let's say you figure out what's the target, the protein that's causing the disease, and you figure out the molecule you need to actually modulate it. The next thing that comes up is how do you run the right trial and design it the right way. And Noga spoke a little bit about this as well, which is, instead of relying on publications or instead of relying on multi-center sort of analyses of the patient's journey, you could now use real world data across the board, across all of the patients that actually have a disease, then go back and say, "What were the risk factors that were driving them?", and that actually helps you design the protocol in a much more effective way. So you can do certain things such as inclusion/exclusion criteria; you can simulate them using real world data to say, "Okay, is this actually reflective of the patient journey in the real world?" And why is that important?

It’s important because when you start recruiting the patients, you can't operationalize something if it's not pragmatic in terms of what's truly happening in the real world. And we've done that, like for one of our vaccine programs. For instance, we used 100 million patient lives (de-identified, of course), working with external partners, to really understand what the risk factors are for getting this certain infection. And that actually was then translated into the protocol design itself, working with our clinicians, and now that study is running in Phase 3. It helped us enrich the trial a lot more with the right patients, the ones who would be impacted, but it also helped us ensure that the trial is much more efficient so we can get to the data that we need and then hopefully the therapy that we need, knock on wood, for patients as quickly as possible.

And maybe the last example I'll mention: a lot of the viewers here probably saw some of the guidance on external control arms that came out yesterday from the FDA. Super helpful, because the way we think about real world studies broadly, and external control arms (ECAs) are a part of it, is that you can leverage them to really contextualize what's the right standard of care, and how patients are doing in the real world with the physician's choice versus the active arm of a trial.

We're not just thinking about it, we've actually leveraged this for our regulatory submissions, working very closely with the FDA. Early pre-specification conversations are super important, so you're getting feedback on your protocol, on the data elements being super transparent about the pros and cons, all of that. The thing I want to emphasize is no data is perfect and no guidance is perfect. There's always a case-by-case discussion, but this is where early discussions and collaborations and transparency and rigor in what's done becomes supremely important and that's something I want to underscore. Like in my organization, we have the highest highest bar because there's a lot of real world studies, but to do it well takes a totally different level of talent and we'll talk more about that and capabilities later.

And then last thing, but not the least, is also the space around what I would say is clinical trial recruitment. We're using real world data to understand where the eligible patients are and then go to the patients, opening our sites where the patients are versus the other way around. And we've seen tremendous impact in terms of both how effectively we can recruit, but then also ensuring that our trials are representative of the patients that we will treat, from a demographic perspective, from a diversity perspective, etc., using decentralized trials and so forth.

So as I have this monologue for the last couple of minutes, and thank you everyone for bearing with me, the point to underscore is that every single aspect, from how we figure out what is causing the disease, to designing that all-important trial, to executing that all-important trial, to generating that all-important evidence which becomes critical for access that ultimately helps to get our medicines to patients, every single aspect of it is being disrupted positively using real world data.

[Evelyn] Thanks so much. Yeah, all really salient examples. And I know you mentioned that it felt like a real treat that yesterday, the day right before our webinar, the FDA released the guidance on external control arms. I think it's just reflective of how, month by month, things are developing. Like over the past year, the amount of guidance that's come out, the amount of real action around providing industry guidance, but also guidance to companies like ours, like PicnicHealth, is really showing this is not just a nice-to-have anymore; the expectation really is high quality real world data as part of an evidence submission.

[Najat] And Evelyn, if I can just maybe underscore one thing that you mentioned. You know, all the examples I mentioned, it's not something that we just do internally, right? Yes, we build the algorithms and the methodologies internally, but the datasets and the partnerships, we have accelerated a lot because of the work that companies, young companies like PicnicHealth and others, are doing. It's a lot of hard work. Let's just put it this way: to take a lot of the data that's fragmented, not always very clean, not standardized, as guidance is coming, and pull that together, creating the cohorts, and as Noga was saying, curating a lot of the work in the structured and unstructured datasets, that all becomes extremely important. So, it's a partnership across regulators, startup companies, and also pharma and biotech companies. It all has to come together. We couldn't do it without each other. It's exciting to hear the examples, but the foundation of it is the work that's being done across the board. So I just want to acknowledge that.

28:25 Why a career in healthcare? (Najat)

[Evelyn] Thanks so much. Really appreciate that call out. And maybe we'll shift gears from our big picture, all the things that we're all tackling, into a bit more of a slightly personal, fun question. So a career in this space of data strategy, data science, and applications of advanced analytics really requires a unique set of strengths and capabilities that each of our panelists possesses. Arguably these skills are valued and transferable across virtually every sector today. So, a question for each of you: Why healthcare? Why do you find yourself here in healthcare using your skills for this type of mission versus, you know, the financial sector or other technology sectors?

I'll start things off with Najat. We'll bounce back to you.

[Najat] Sure. Why not healthcare? I mean, listen, I'm a bit biased maybe. A quick backstory, not to bore anyone, but you know I grew up in Bangladesh and then I grew up in the UK. I came to the US when I was 17. So really lots of continents in a short period of time. And the thing that always moved me from early on, and it helped that my parents were both physicians, is the fact that when you see somebody get better, person, animal (I'm a big animal lover), that feeling is unmatched.

Like I even get goosebumps as I think about when somebody actually, it's not just the patient, you see the impact on the family. Early on, I recognized that was something that would just get me out of bed, like that's my passion. The other thing I also recognized was that the pain and suffering when you actually have a disease, whether it's for the family or the patient, that's common, across any country, whether it's a developing country like Bangladesh, or you're in the UK, in the harshest area, or in the US and some of the communities that are underprivileged.

And equity in terms of health care is something where we're not where we need to be. So I think I knew early on just based on that. It's just something that's unwritten. Growing up, I also did a lot of non-profit work going to the field, and I remember I was excited about it. My dad was like, if you're excited about something, just go to the entry level and just experience it, right? And it was tough, and I didn't know how to speak the language, and there's so much, but the barrier and the kindness and the empathy that comes from it, I just think it's not something I could feel in any different sector. And then COVID just puts it on another level, right? Like the reality of it, as we saw, that was the most important thing. Health became the most important thing and so just fast forwarding.

Then the next question was how do you have an impact, and you mentioned data is so important, but I mean (I don't know, Jesper, Noga maybe can comment on it too) like 20 years ago this wasn't even a very well established field. I remember when I was doing my undergrad as an international student, which means you have to keep your GPA super high or else it's a problem because I was on a scholarship and stuff. Anyways, I did all of the sciences like physics, chemistry, bio. I also did computer science and I also did econ. So it was just this weird plethora of different disciplines and I remember so many people asked me, “Najat, why? Do you just like pain? Why are you doing all this stuff?” And I was like, well, you know, I could see the computational aspect would become more important as more data is generated, and just like things that we cannot compute in our minds, right, and science was at the core of what I cared about: science and medicine of course.

The reason I went to a liberal arts college and did econ is that I recognized when I was growing up that unless you take all those great ideas and make something out of them, like a business that's sustainable, you can't really have an impact in a sustained way for patients. So that was the confluence of it. It came in a very organic way, and when I talk about it retrospectively it sounds so cool, but honestly at that point I was just going to say, "I'm interested. I think this is going to be interesting." So, then when I did my PhD at Penn I was very lucky. I had a crossroads. I could pick an advisor that was very core organic chemistry, like, you know, publish, publish, publish, and then I had another advisor who was new, not tenured, so higher risk, but was much more into interdisciplinary approaches. Long story short, my PhD was a confluence of working on physical modeling of what molecules might work, and there are so many colleagues in the computer science department that helped me. I learned from them. Then actually coming to the lab and making it with my own hands as an organic chemist. So I was developing molecules for early detection of cancer and also therapeutics, and then actually working with folks in the Penn med school to translate that into in vitro and in vivo work. That was a very sort of non-traditional PhD but it was very application-focused, and I had a PI that actually was like, "okay if you do good work, if you don't we're gonna have a conversation." So the reason I'm saying all this for folks that are watching is it's not something I decided on, but I think as you progress, now thinking about data science and AI, machine learning, RWE is great, but the question is what do you apply it to. So understanding multiple disciplines of what you apply it to, so that it is going to actually have an impact, and not just doing it because it's a cool algorithm or a cool methodology, is super important if your core mission is to have impact on patients.
So I'll just say I could do it in finance but I don't think I'd be this excited if I did it.

34:13 Why a career in healthcare? (Noga)

[Evelyn] Well thanks so much. I’ll pass things over to Noga. Why healthcare for you? I know Najat just mentioned creating sustainable companies and that felt like a really good segue to pop it over to you.

[Noga] I think in some ways my experience parallels yours, Najat, but actually it's kind of the opposite because I got into this a little bit later in life. I actually also will just share my personal story. I got into healthcare and founded PicnicHealth because I'm a patient myself, and until I had the experience of being a patient, I wouldn't have thought of applying the skills that I had built over my career to healthcare. I think it's so trite, so obvious that it goes without saying, people always say if you don't have your health you don't have anything, but when you're in the situation where you're actually dealing with that there's just no substitute. There’s nothing more obvious than if you don't have your physical, your mental well-being, you don't really have anything else. I was actually a patient and I was diagnosed with Crohn's disease in my 20s and it was just looking around me at the landscape. It was just so obvious, from the experience of being a patient and how divorced I was from the research space, how hard and frustrating it was for me to get a hold of my own data, that if we could just bring patients in, there was a sustainable business to be built that would benefit patients and so clearly benefit other parts of the healthcare sector.

I think the bar for improving things in terms of the healthcare experience for patients and their ability to participate is sadly so low. There's so little that's happened. The world is changing, I think things are getting a bit better, but at the end of the day, being here in Silicon Valley, you feel like there are people who are spending their whole careers optimizing some little thing, like a little button, and then you kind of look at the experience patients are having in healthcare and the way we're using data in our system and it's like these broad swaths of opportunity. For anyone who's had any kind of personal experience you can just see that there's massive, massive opportunity for really big shifts in the impact that you can have. That's really kind of palpable.

37:37 Why a career in healthcare? (Jesper)

[Evelyn] Thanks so much, Noga. Jesper, how about you? How did you find your way into healthcare?

[Jesper] By failure actually. I failed studying veterinary medicine back in the 90s and then I realized, while I couldn't really learn another anatomy textbook by heart, it was during the whole internet development and taking the IT route was just interesting. Just by chance, one of my colleagues said, "I have this database at one of the hospitals that needs a little bit of help. Can you go and meet up with them?" And then I ended up in a research group for the next 13 years. It turned out what we did there was really impactful. There was observational data on a global scale. The D:A:D study was actually funded by industry through the FDA to look at adverse drug reactions for HIV treatment. Obviously that suddenly became really really meaningful, and as you had that opportunity it also gave you an opportunity to suddenly realize how closely connected the world is in this.

I remember some of the work we did. We actually went to the NIH in Washington and presented to Anthony Fauci about data models and what to do on a global scale for a network that still exists today. And then getting that interest of industry, and now the regulatory world. I just got soaked into it and couldn't let go of it. To me observational data, the real world data, has been part of my work for the past more than 20 years now, so not really getting out of it. It just seems to be coming back in new wrapping. And more of it by the way. So a lot more data than in the past.

39:21 What are the most important considerations for assessing or selecting a data source or partner?

[Evelyn] Well I think we're all glad that you're here, that the three of you are in healthcare… And probably we won't let you leave because there's more work to be done! Jumping back to you, Jesper, I saw you speak about DARWIN EU at a conference last fall. It was fantastic. Since then, I saw that the EMA announced the first eight data partners selected as part of Phase I of DARWIN. These seem to include both public and private institutions from across Europe including Spain, the Netherlands, Estonia, Finland, France and the UK. From your perspective, if you could distill down, in the time that we have, what are the most important considerations for assessing or selecting a data source or partner, whether it be for the purpose of DARWIN or more broadly?

[Jesper] I mean DARWIN in particular actually has a connection office that proposes this to EMA and The Advisory Board as well. There are lots of factors to be considered, maturity of the data source, the quality management system of that, the whole procedures, and ultimately, at the end of the day, also that you can demonstrate scientific value from that data.

Then this is a network of data sources, your capability to connect with the common data model and then interact in a collaboration around this is obviously also important. I think there's a number of factors, but if you would take one down, is really the scientific question driving the selection of the data sources because we need the relevant data sources to the scientific questions we actually see in the regulatory work. So, you'll see a bit of dynamic around this as we expand into primary or secondary particular disease registries as well. We're talking really rare disease. We have some of the reference networks in Europe we can utilize in a different fashion but it's really true to make sure we get the appropriate data sources to the scientific questions and then obviously data quality and Quality Management Systems around that matter.

But ultimately we are fortunate to have many data sources where we can pick from. So, I think the choice is really to give everyone room and then see it develop over the years to come.

41:40 Are you seeing promising progress in how real world data can be used across geographic context? (Najat)

[Evelyn] Great, thanks so much! Najat, given you have global oversight, certainly at any given time I imagine you're thinking a lot about what's happening in Europe, what's happening in the U.S. and elsewhere. In terms of data systems, historically we're often very confined to these geographic barriers and I'm curious if you're seeing promising progress in how real world data can be used or is being used today across diverse geographic contexts? Any examples of that?

[Najat] I mean, look, there's definitely progress being made with different approaches in different places. Maybe I'll break it up into two ways. One is true regulatory-grade real-world data and then there is non-regulatory-grade real-world data. I think for the number two bucket there's quite a bit of progress being made, especially in regions like Brazil, in the LATAM region, in China, and others as well. Sometimes people ask me, "Is that even important?" It actually is. It depends on what question you're trying to answer. If you're trying to answer the questions around (Noga mentioned this, Jesper mentioned it) understanding the patient journey. Understanding it early on. What's the standard of care there? So it helps you design, say, the protocol in a much more thoughtful way with a global mindset, which is definitely critical for us. Also if you're trying to recruit patients, just understanding which sites patients are going to. From a site selection perspective there's a lot of opportunity. I think Noga said, in terms of the bar there's a lot to be done. I always say the flip of it is only 10% of what we make has success. So, there's huge room for improvement. I think from that perspective, especially with LATAM, also some of the Gulf countries, we are starting to see that they're actually trying to aggregate a lot of their data sets, connect them, link them and so forth. Maybe not at a national level, but much more at least at regional levels or key hospitals. So, it's a start.

As we think about more globalization and having more diversity in our trials, that's definitely helping, so that's one example. I would say from a regulatory grade perspective it's still really… like I was mentioning before, a lot of the work is in the U.S. I think a lot of the funding that's really gone in from VCs and private equity funds, etc. (congratulations, Noga, on the recent raise) is really helping us, in the last few years, push to having higher regulatory grade rich data. Because at the end of the day a lot of it comes down to the exact question you're trying to answer.

I think in Europe, Jesper can say more than I can, but with DARWIN and a lot of the other networks it's definitely getting there as well. Ex-U.S., some of the other regions, I don't think they're at that regulatory grade yet. If I can be just really frank, there are some changes, like in Singapore, there are some really good data sets from Taiwan, but for large countries I think step one is just getting everything together. Then step two is that next layer of maturity, of actually making sure of your data quality, privacy considerations… A lot of the guidance is also still evolving in some of those regions as well. So I try to be practical about the answer, but it's definitely moving in the right direction. But what I want to emphasize is even what's happened today, it can be really really useful in terms of how you develop it.

45:23 Are you seeing anything today that suggests attitudes around data crossing borders are changing?

[Evelyn] Definitely. Shifting gears to Noga. I think you might have a unique perspective on this coming from PicnicHealth. Are you seeing anything today, as someone who's working with life sciences partners, that suggests that these attitudes around data crossing borders are changing or should change?

[Noga] I'll leave the whether they should change aside for the moment, but I will say that I think we're starting to see some encouraging signals of progress, that there is more openness to using real world data across regions. At the core I think people recognize that, more and more, every data point counts and you just have to find patients where they are.

Among our broader portfolio of therapeutic areas we work in we really see this the most in rare and ultra rare diseases, where every data point, every patient counts. It’s particularly important and what we've found is that what's most important is the ability to be flexible to adjust to what matters in a particular region or context, to be able to get that really complete picture. Then we found that being able to, with low activation energy, go back in and change the data model to meet what's important in a particular region or context, customize it to align with what they're looking for, it really does a lot to give comfort rather than just having a static data set that has to be used in the way it was designed for another region.

Ultimately I think sponsors are using our data to support evidence packages and regulatory submissions both in the US and increasingly across borders as well.

[Evelyn] Thanks, and from this PicnicHealth perspective, as Noga knows, we're often working with real-world data and evidence champions from their respective organizations, as opposed to, you know, the naysayers or maybe traditional trialists. The challenge, I think, at PicnicHealth is less about convincing folks of the value of real world data and these various novel applications, and more about arming our main champions with the information that'll help them get broader internal buy-in.

47:54 What's your experience been with navigating and securing internal buy-in for real-world data projects? (Jesper)

[Evelyn] So a question, we'll start with Jesper and then move on to Najat. You both champion initiatives that can be considered innovative in their use of data and data sciences. What's your experience been with navigating and securing internal buy-in for real-world data projects and fighting that good fight for real-world data?

[Jesper] Often when it's maybe outside the status quo, I think what has really been the driver with us has been a strategic approach from the top and then downwards. It's been something that we decided to put into the European strategies, and in the Danish Medicines Agency obviously also for the past 10 years in our strategy, and then I think it obviously helps if you're in an environment like the Danish, where doing registry-based research to inform how you do healthcare and treatment and so forth is just baked into the way you do things.

That has been a major factor in making that change and then through demonstration projects, step-by-step to actually do this in our procedures and our processes. To bake in the use of data in a different way. That's why we established the Data Analytics Center, to have a core function of expertise that would collaborate with the other skills in our agency to use real world data in decision making processes. Whether that's in our pricing department, whether that's in our safety surveillance, whether that's in our approvals, even to the point of our HR system, we can actually do stuff with data analytics matters. Just bake that in everywhere, that starts to change the paradigm. That's the same mechanism with the other agencies in Europe and it's the same mechanism within EMA. It's really something that we from a strategic point of view, have decided to do and that really makes the change.

49:51 What's your experience been with navigating and securing internal buy-in for real-world data projects? (Najat)

[Evelyn] That's great. Najat, you've probably seen changes over time but today, are you still needing to convince the value of real world evidence? Are we past that stage? Can you push the boundaries a bit more around what we're doing then with that real world data and evidence?

[Najat] What I would say is, right now, where we are, there are a couple of things. I think it's very similar to what Jesper said. It started very much with, will real world data actually replace RCTs? That was the big thing if you think back like four or five years ago. I think that was almost an overambitious aspiration which was not based in reality. The reality is clinical trial data is extremely important, but we can always say there's more data that you need to have more holistic evidence generation. It’s an “and,” it’s not an "or".

So step one was just to shift that, at least in my experience, and like Jesper said, it started from the top. Our head of R&D just had great foresight in terms of the impact that data science can have end-to-end. Then our CEO, Joaquin Duato, I mean he has said the combination of science and data science technology is going to be what's really propelling us in the future. So, I think having that foresight, investment, and support from the top. Then the other thing we also did which was really important is building a team, because to know what good looks like and to actually do great work and to play in a professional arena, you have to have professionals at the end of the day. So, we built it as a function, as a new business unit that actually reports through me to the EC eventually.

The thing that was important as I built out the team was to get that grassroots support, because you don't want just a top-down approach for a large organization. You want the grassroots support as well. You need to have demonstration projects, back to what Jesper was saying, but focused in areas that are actually going to make a difference and in a time frame that's not five years. That's why we started in Clinical Dev and not Discovery because they're taking so long, just being super practical. Everybody can get behind that question, like all different groups, but the other part was actually having bilingual talent that understands clinical science. A lot of them have backgrounds in oncology, neuroscience, and so forth, but also folks that are appointed experts in real world data, AI, machine learning, and so forth. That creates a common bridge in languages so that you can actually have a discussion at the grassroots level. How would you solve a lot of these problems? For where we are right now, tremendous progress. We started building this team two and a half years ago. It feels like much longer than that. We have data science fit-for-purpose use cases across every single clinical program, so it's totally done at scale today. You'll get the odd question, "Do we need real world data?", but we have enough demonstration projects where it started off as, "hey let's keep it for an optionality perspective," but then it becomes really important as you go towards regulatory submissions.

The other thing is it's been done with great care and rigor, not just internally but with our external partners and also with great SMEs. I also always have the team actually review it with third-party SMEs, so it's like you never want to be drinking too much of your own Kool-Aid. I always say you always want people to look at it. Maybe this is the grad student in me, and I’ll always poke holes in everything because that makes the ultimate output better.

There's a lot of support, but there's always priorities in an organization so it's continuous. It's not ever done. There's always priorities and if there's one bad real-world study that comes up, everybody regresses and that's what we have to be really careful about. If it's not done the right way, same thing with AI algorithms and so forth. That's why I always say we have to work together as a team, internal and external.

54:16 If you had to pick one data or digital development that has you excited for the year ahead because of its potential to transform our understanding of a disease, what would that be?

[Evelyn] Well thanks so much. I imagine, too, as the suite of capabilities that real world data can be applied to is ever-growing, I'm imagining that there's always going to be that iterative conversation around, "Five years ago we weren't using real world data for X, but now we are." So,  how do we make sure it's the quality and the output is what we need to see?

On that note, because we're coming down for time, and we do want to save time for questions from the audience at the end, I'll ask for maybe a rapid fire round of responses from our panelists on this last one. With all these advancements in data science and digital health, it's transforming how we're using real world data as all of you have spoken about today in clinical research and in clinical care. If you had to pick one data or digital development that has you really really excited for the year ahead because of its potential to transform our understanding of a disease what would that be? Noga, I will start with you.

[Noga] That's a great question. I will say having now worked in the sector for 10 years, I think almost every single year actually when I got into this…. I was re-watching a video the other day that I had put together when we were just starting 10 years ago and my co-founder and I were saying, "there's been a regulatory change and within a matter of a year, patients are all going to have access to all their data through APIs." This complete transformation is happening and that was basically the last time that I was optimistic about that. I learned my lesson very quickly, but there have been these glimmers of hope along the way that were actually happening, that there was progress. That regulatory change is going to make a difference and I think the reality is, we have learned not to be overly optimistic. And yet I actually think what's happening right now and the changes that 21st Century Cures Act is bringing on is incredibly powerful and exciting. It's really like we're going to be shifting the balance of power to ensure not only that patients can get their data but also that they're really empowered to choose how to share it.

Ultimately, what that means is fewer barriers to access for patients and much more real-time data, which I think not only matters for patients and their own direct experience navigating the healthcare system, but also for the kind of life-saving clinical research that they want to participate in and choose to contribute to.

[Evelyn] Great and I think a good segue to Jesper, in terms of a regulatory example from Noga. What has you excited as a regulator yourself about this year ahead?

[Jesper] Oh no doubt, the scaling of DARWIN EU into our regulatory processes is the biggest. But what I'm also seeing is that in our own little country, we are taking learnings from Covid-19 and the use of real-time surveillance to the next level together with computer scientists, and building artificial intelligence and machine learning techniques on top of that, because ultimately the next pandemic may unfortunately just be around the corner. So, if there's an opportunity to learn from that, I'll see some of that effect this year already, and we're looking forward to having a kind of new tool available in our toolbox.

[Evelyn] Great! Thanks! And Najat, how about you? I'm sure there's lots that has you excited for the year ahead, but what would be the standout for you?

[Najat] What I'm really excited about is the fact that thanks to ChatGPT, which still needs a lot of work, AI is actually becoming much more mainstream and people are starting to get that this is here to stay, which I think, if done the right way, is really good for everything we're talking about. Because I think one challenge that comes up is sometimes (I'm just thinking for the broader audience) it's hard to understand how all of this is going to change healthcare, and the details and the data quality and everything that we rightfully talk about.

I think once you can actually use it and you democratize the use of something and you see how interesting and cool it can be, then the underpinnings of it, which is generative AI, generating new thoughts, images, videos, whatever… Think about the future, even happening right now, you can generate many different structures, protein structures, biology. You can generate, you know, different chemical structures. Because at the end of the day, just like the English language, or whichever language it might be that ChatGPT uses, in chemistry and biology you also have a language. In biology it's the amino acids and in chemistry it's the different atoms. At the end of the day, I think you're going to see an arc. It's going to take time, to Noga's point, it's going to take time because there's going to be a lot of ups and downs, but that gets me excited because that's novel insights into molecules and structures that we can create and ultimately think about. Even protocols we write, you can actually generate that if you have the right data set, so you make things faster so that our clinicians or data scientists can focus on the right areas. That gets me excited because I think it's just making it a little bit more tangible for people and, if done the right way, I think it can also progress a deeper understanding and appreciation for what we are all collectively trying to do in healthcare.

[Evelyn] Thank you and yes it feels like there certainly hasn't been a time where there's been more dinner table conversations around AI and certainly Covid helped with that. I think everyone now understands…

[Najat] Yeah, can I just say, Evelyn, it's so funny, my dad, who is pretty engaged, he tells me the other day, "Hey, I'm doing this project proposal for this non-profit" (he does a lot of non-profit work), and in that one of the core focuses was AI and machine learning. I was like, I'm so happy to see that. This is women in underprivileged areas, ensuring that they get the right care early on, and telehealth and all the stuff. Then I said to him, "I'm so glad to see that." He's like, "Why, it's mainstream. Why are you surprised?" And I was like, whoa, that's somebody that would not have said that a year ago. That was a personal moment for me when I heard him say that.

[Evelyn] Amazing! Yeah, valuable data point for us to have as we're all in it 24/7 to get that like oh actually there's diffusion into everyone's lives at this point.

1:01:08 Q&A

[Evelyn] I know we're at time. I would love to keep talking to our panel for another hour or more, but we are at time and I do want to still give the opportunity for any attendees who are able to stay on the line that wanted to ask questions or hear the responses to some of the questions you have. I will pass things over to Sydney at Xtalks to help us wrap up with that session.

[Sydney] Thank you very much, Evelyn, and thank you all for that very insightful presentation. We have time for maybe a question or two. I've already received a few from our audience, so I will start with those.

Our first question is “How to evaluate the method to deal with missing data from RWD?”

[Najat] I will have to drop, but I can quickly answer that question. There are many different approaches in terms of looking at missing data, for automating. First thing is definitely take a training data set and then you look through it and say what are the parameters or elements I'm looking for. There's quite a few methodologies that have been already published, for instance by Sebastian Schneeweiss at Harvard and others. There's actually an approach that we use that framework to say, okay what are the key elements, whether it's demographic or some of the things, endpoints that you're looking at for that question. Always remember the intended use and then we see what is the missing data. Then there's a threshold of how much missingness you can have. Generally we don't go with one dataset, we look at two, three different datasets. A lot of the time, what we do, it's an iteration. We work with our partners back and forth in terms of where we need to curate more, where we need to fill gaps and in parallel we're also having conversations with regulators just to say, okay we have ORR but progression-free survival, there's this XYZ is lacking.

So, it's not like a static approach and I think that's the beauty of it. As with almost anything in science, it's pretty iterative, but that's what I would say. For a lot of what we're doing, there are frameworks and methodologies. I don't think there's one that everybody uses. You can also use different ones, specifically for oncology versus some of the other disease areas. I'm sure Noga and Jesper will have many more great insights. Great to see everyone! I will have to drop. Thank you so much for having me, again!
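
As a rough illustration of the check described in this answer (pick the key elements for the intended use, measure how much of each is missing, and compare against a threshold), a minimal Python sketch might look like the following. The field names, the sample records, and the 25% threshold are all hypothetical, chosen only for illustration; the panel did not name a specific framework or cutoff.

```python
from typing import Dict, List, Optional, Tuple

# One patient record: field name -> value, with None meaning "not captured".
Record = Dict[str, Optional[object]]


def missingness(records: List[Record], field: str) -> float:
    """Fraction of records where `field` is absent or None."""
    if not records:
        return 1.0  # an empty dataset carries no information about the field
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)


def fit_for_purpose(
    records: List[Record], key_fields: List[str], max_missing: float
) -> Dict[str, Tuple[float, bool]]:
    """For each key field, report (missing rate, whether it is within threshold)."""
    report = {}
    for field in key_fields:
        rate = missingness(records, field)
        report[field] = (rate, rate <= max_missing)
    return report


# Hypothetical oncology-style dataset: response (ORR) is well captured,
# progression-free survival (PFS) is not.
dataset_a: List[Record] = [
    {"age": 61, "orr": "PR", "pfs_months": 8.2},
    {"age": 55, "orr": "CR", "pfs_months": None},
    {"age": None, "orr": "SD", "pfs_months": None},
    {"age": 70, "orr": "PD", "pfs_months": 3.1},
]

# The threshold is an assumption; in practice it depends on the intended use.
report = fit_for_purpose(dataset_a, ["age", "orr", "pfs_months"], max_missing=0.25)
for field, (rate, ok) in report.items():
    print(f"{field}: {rate:.0%} missing, {'ok' if ok else 'needs curation'}")
```

Running the same report over two or three candidate datasets would show, field by field, where further curation or a different source is needed, which is the iterative loop described above.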

[Evelyn] Thanks, Najat! Thanks for coming. Anything to add from Noga or Jesper?

[Jesper] A quick one from the regulatory perspective. I appreciate the references to Sebastian's name and others and assimilation of how to do that. That's definitely a useful approach to consider. I would just like to point to the fact of the Data Quality Framework we have released where the implementation of that towards real-world data is something we'll do as a demonstration project over the next half year. So, maybe a space to watch to better understand the regulatory perspective around data quality and missing data. Something to watch out for.

[Noga] And I'll just say very briefly that I think because we work directly with the patient, our approach here is pretty different. We actually make an effort to look at the individual patient and, using claims data, understand every interaction they've actually had in the healthcare system. Then we can literally go chase those down one by one and actually ensure that we have that really complete (I think people are calling it "past complete") picture. You're not going to do that on a national scale, or continental scale I should say, but I think when you're working with smaller patient-centered cohorts it is actually possible to chase that information down and get to completeness, at least in set windows of time that you can define.
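
The claims-based completeness check described in this answer can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Encounter` type, the provider names, and the dates are hypothetical, and matching an encounter on provider plus service date is a simplification, not PicnicHealth's actual method.

```python
from datetime import date
from typing import List, NamedTuple


class Encounter(NamedTuple):
    """A single interaction with the healthcare system (hypothetical shape)."""
    provider: str
    service_date: date


def missing_encounters(
    claims: List[Encounter], collected: List[Encounter], start: date, end: date
) -> List[Encounter]:
    """Encounters seen in claims within [start, end] that have no collected record."""
    in_window = {e for e in claims if start <= e.service_date <= end}
    gaps = in_window - set(collected)
    return sorted(gaps, key=lambda e: (e.service_date, e.provider))


def completeness(
    claims: List[Encounter], collected: List[Encounter], start: date, end: date
) -> float:
    """Share of in-window claims encounters that are backed by a collected record."""
    in_window = {e for e in claims if start <= e.service_date <= end}
    if not in_window:
        return 1.0  # nothing expected in this window, so nothing is missing
    return len(in_window & set(collected)) / len(in_window)


# Hypothetical patient: claims show three encounters, records exist for one.
claims = [
    Encounter("Clinic A", date(2022, 1, 10)),
    Encounter("Hospital B", date(2022, 3, 5)),
    Encounter("Clinic A", date(2023, 2, 1)),
]
collected = [Encounter("Clinic A", date(2022, 1, 10))]

window_start, window_end = date(2022, 1, 1), date(2022, 12, 31)
to_chase = missing_encounters(claims, collected, window_start, window_end)
score = completeness(claims, collected, window_start, window_end)
print(f"completeness in window: {score:.0%}; records to chase: {to_chase}")
```

Restricting the check to a defined window reflects the point made above: completeness is achievable for a set period of time, not necessarily for a patient's entire history.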

[Sydney] All right thank you all for those answers. We'll get to one last question. So, earlier the panel mentioned the importance of representative RWD. This audience member would like to know if any of the panelists could speak to promising ways that they are seeing RWD being used to study diverse, often underrepresented patient groups and sub-populations.

[Noga] Yeah, I can speak to that. I think… oh man, I hate to say this… but as I mentioned earlier in another context, the bar is so low, and I think this is another place where we see that. At PicnicHealth, in the cohorts that we build, we have this very deep patient-centered data compared to running an observational study that you have to recruit into, or compared to a clinical trial population. You are asking some effort from patients to sort of sign up and make a contribution, but it's a five minute sign up that patients can do from their home. So you're not getting that same barrier that you would see, for example, in clinical trial participation. When we see those representative, more diverse datasets, the reality is that sometimes, for example in our work in MS, it's just kind of shocking how little people understand about what the disease even looks like in non-white populations. We know black women are more affected than other populations. We know their disease looks really different, has different outcomes, but I think we're just starting to piece together what that even looks like. For better or worse, I think the reality is that sometimes datasets that give access to a look at diverse populations end up just being about the baseline of getting to the core understanding you have about a disease for a broader population and making sure that extends across the reality of the whole population.

[Sydney] All right well thank you so very much for those answers and for this presentation. We've reached the end of the Q&A portion of the webinar. If we couldn't attend to your questions, the team at PicnicHealth may follow up with you or if you have any further questions you can direct them to the email address on your screen. So thank you everyone for participating in today's webinar. You'll be receiving a follow-up email from Xtalks with access to the recorded archive for this event. The survey window will be popping up on your screen and your participation is appreciated as it will help us to improve our webinars.

Now please join us in thanking our speakers Evelyn Pyper, Noga Leviner, Najat Khan and Jesper Kjaer. We hope you found this webinar informative. Have a great day, everyone.


Sydney Perelmutter:

Good day to everyone joining us and welcome to today's Xtalks webinar. Today's talk is entitled Real-World Data By Design - Incorporating Different Data Types Into Clinical Trials. My name is Sydney Perelmutter and I'll be your Xtalks host for today. Today's webinar will run for approximately 60 minutes. This presentation includes a Q&A session with our speakers. This webinar is designed to be interactive and webinars work best when you're involved, so please feel free to submit questions and comments for our speakers throughout the presentation using the questions chat box, and we'll try to attend to your questions during the Q&A session. This chat box is located in the control panel on the right hand side of your screen. If you require any assistance, please contact me at any time by sending a message using this chat panel. At this point, all participants are in listen only mode. Please note that this event will be recorded and made available for streaming on

At this point, I'd like to thank PicnicHealth who developed the content for this presentation. PicnicHealth is a healthcare technology company that partners directly with patients to build deep real-world data sets. The company leverages state-of-the-art machine learning combined with human curation to port complete medical records into an easy to use online application. The platform gives patients unprecedented access to and control over their medical records and, with their consent, the opportunity to contribute this valuable data to further scientific research. Now I'd like to introduce our speakers for today's event. As the head of Analytics Innovation and Digital Health, Gaelan and his team are responsible for creating and developing innovations across R&D. He co-leads the BMS digital innovation pillar for global drug development, which is enabling a spectrum of digital solutions, including several types of digital trial capabilities. In past roles, Gaelan has led and developed strategic partnerships with large academic medical centers and networks. He has also supported trial design and startup for the BMS oncology and immunology programs.

Next I'd like to introduce Sneha. Sneha is a certified PMP with over 15 years of industry-specific project management, risk planning and mitigation and regulatory experience spanning all phases of the drug development life cycle across a variety of therapeutic areas. As the Global Lead of the RWE Innovation Pillar of Decentralized Studies, her focus is to define the corporate innovation strategy for this pillar through strategic review and landscaping of industry trends and customer needs, engage key stakeholders and guide the ideation process to define and refine the existing innovative and strategic portfolio for decentralized studies. In her nearly decade-long tenure at IQVIA, Sneha has been responsible for providing senior oversight to cross-functional teams across a wide variety of therapeutic areas and study designs with a special focus on rare disease registries and post authorization safety studies.

Andrew Larsen is the VP of Partnerships at PicnicHealth and an industry leader in creating fit-for-purpose real-world data solutions. At PicnicHealth, he works with partners across the life sciences ecosystem to help advance disease understanding and support the development and access to innovative medicine for patients. Prior to joining PicnicHealth, Andrew worked with partners across the life sciences ecosystem to help advance disease understanding to develop portfolio and asset strategies across all stages of development with a specialization in evidence generation needs.

Lastly, Evelyn Pyper is Evidence Strategy Lead at PicnicHealth, a patient-centric real-world data company. Her career in real-world evidence spans the public and private sector as well as regional and global markets. Prior to PicnicHealth, she worked as Associate Director of Market Access at J&J Global Public Health, focused on securing access to HIV treatments in Sub-Saharan Africa, and as RWE Manager of a diverse portfolio of partnerships and research projects at Janssen Canada. Evelyn has a Bachelor of Health Sciences with a minor in Psychology from McMaster University and a Master of Public Health degree from Queen’s University. Now without further ado, I'd like to hand the presentation over to our speakers so you all may begin when ready.

Evelyn Pyper:

Wonderful. Thank you so much, Sydney. Good morning or good afternoon from wherever you're joining us today. We are really pleased to welcome you to this second webinar in the PicnicHealth 2023 webinar series. When the vision for this series was coming to life, it centered around two core ideas. The first was recognition that the conversations around real-world evidence really needed to move past these very didactic sessions on what is real-world data, what are its challenges and opportunities into more kind of topic and context specific conversations. Second, the question of who are the people from across different organizations and perspectives that we'd want to ask to sit down for a coffee and pick their brain on these nuanced topics. With those ideas, the webinar series was born and I'm really excited to have with us here today Sneha, Gaelan, and Andrew to share their unique perspectives on the use of real-world data and innovative approaches in clinical trials.

How has the use of RWD or decentralized approaches opened up new possibilities for clinical trials?

Evelyn Pyper:

With that, we'll dive right in. As all of you know, the overall theme for this webinar series is “RWE ROI”, which is really about going beyond the return on investment of RWE to really considering what's the risk of inaction, the risk of not doing something, when it comes to use of real-world evidence. On that note, if we reflect back on how clinical trials for years have been done in the past versus how they're starting to evolve to be done today, we can all start to appreciate how much might not have been possible if trials had not evolved to incorporate new designs or new types of data. My question for all of you, and I wouldn't say it's a softball question to start off by any means, is what comes to mind for you when you think about how has the use of real-world data or decentralized approaches opened up new possibilities for clinical trials? I'll start with you, Sneha. What comes to mind when you hear that question?

Sneha Kishnani:

Thanks, Evelyn. It's a very interesting question. What comes to my mind is access to additional patient populations. When we think about decentralization of different data collection approaches–patients that may be very ill, patients that may be in rural areas, not necessarily having access to care–allowing and enabling data collection to happen in their workplace or in their home enables additional patients to be part of that research, and we can then strive for representation and diverse representativeness in the clinical trials that we're after.

Evelyn Pyper:

Great. Thanks so much. Gaelan, what about you? What comes to mind when you think about what's possible today?

Gaelan Ritter:

Yeah. I mean, Sneha's right. I think a lot of it is that kind of the flexibility has opened up access, so clinical trials were a very closed world in the past, and you needed to be in a very specific place in your life and in your finances and in your disease state to be able to participate and making things more flexible, adding the opportunities to participate in the kind of ways that work for you as a patient has really changed that dynamic and opened the aperture of who's able to actually participate and how they're able to participate and the amount of effort that it is to be in a clinical trial. I mean, clinical trials have always been and probably will always be more difficult than standard of care to be a participant, but at the same time, that barrier is coming down a bit over time with these new technologies, which is something that makes it more palatable to be a part of a trial. It's really been a nice kind of, it's leading to nice opportunities for patients coming in. Lot of work still to be done, but it's heading in the right direction.

Evelyn Pyper:

Certainly sounds hopeful – feels like the theme there. Andrew, what about you? Anything else to add?

Andrew Larsen:

Yeah. Well, first off, very excited to be on a panel with yourself, Gaelan and Sneha, and looking forward to the conversation today. I think both the points they brought up are absolutely critical. How do we get more patients involved in trials where the actual data is then going to be much more meaningful to really all downstream stakeholders, regulators, payers, providers, and ultimately patients where it's going to be more reflective of the populations that are going to receive the treatment at the end of the day? I think there's another sort of component when you think about the challenges that have existed for trials. They take a lot of resources, a lot of time to pull them off and there's always been this bit of a bottleneck where it sort of comes to inflexibility where, you know, if there are ClinOps people on this call, I'm sure they all have their own horror stories with a dozen amendments involved and it's really hard to do, but it's often critical because the understanding of diseases changes, the understanding of the population, the treatment, what you're trying to actually answer.

A big part of real-world data is how do you actually create a system with these trials which allows for a lot more ability to evolve what data can be pulled in to answer new research questions as they change. I think there was just a JAMA article or publication last week that basically said one out of five Phase III trials in oncology changes its primary endpoint. I'm sure we could spend time discussing that data point the rest of this call, but I think it really does speak to the fact that research questions change–and how you bring in additional data sources, without trying to pivot the Titanic, to actually answer those questions and really optimize your trial from the start, to have that flexibility baked in.

Where are you seeing the greatest need/opportunity for RWD today?

Evelyn Pyper:

Thanks so much. I'm all ready to dive right into things, guys. We're off to a great start. Thinking about all the different ways that real-world data may now be leveraged for clinical trials–from early informing study designs to capturing additional outcomes to even serving as an external control–I'm wondering, Gaelan, where are you seeing the greatest need or opportunity for real-world data today, if you had to prioritize?

Gaelan Ritter:

Yeah. I think in terms of need and opportunity, it really goes back to what Andrew's saying, so I think data acquisition. We've done a lot of work in patient finding and enrollment and trial design. I think that data acquisition is the next kind of frontier for real-world data. The way we collect most of the data in studies today using EDC and capabilities like that is something that's 25 years old at this point, in terms of a technique, and it leaves enormous burden and gaps on patients and sites to be able to participate in that activity.

To Andrew's point, there's no flexibility built into that system. So you need to–I think with real-world data–taking that next step, working on the acquisition side, incorporating more complete medical records into the clinical trial record, images, scans, genomics, wearables and other data sources that we see coming through, that are honestly becoming standard practice in certain diseases, and clinical trials are taking a longer time to catch up unfortunately. Rather than leading new technologies for data acquisition, trials are lagging in a lot of instances, and so I think that's the space where we really need to start accelerating the work in real-world data to be able to bring trials to parity and then kind of push beyond the techniques that you're seeing in standardized practice. It's going to lead to a lot of opportunities to get more information and more learnings out of the studies we run and also make them more efficient for everybody that's involved.

Can you speak to some emerging solutions that are increasingly needed in today’s trial world?

Evelyn Pyper:

Right. Thanks. Yeah, certainly sounds like many of these approaches, they're innovative in the context of clinical trials but not innovative in their own right necessarily. Sneha, from your perspective at IQVIA–which we know is a large organization, with a wide variety of solutions and teams working in this space–does what Gaelan shared align with what you're hearing from customers across the board? And given your title, which is Global Lead, RWE Decentralized Studies, can you speak to some maybe emerging solutions that are increasingly needed in today's trial world?

Sneha Kishnani:

Yeah. I am hearing what Gaelan is saying. The lack of flexibility is creating burden across the whole system in essence. Not only at your sites with your patients, but with your sponsors. With respect to the emergent solutions, I think participant centricity is critical. Optionality is critical. I just attended the DTRA decentralized clinical trials conference back in mid-April up in Boston, and the consensus was unanimous. There needs to be simplicity in what we are doing and it's really complicated and confusing for all of the stakeholders involved. It doesn't need to be, so I think we need to get back to our roots. We look at what is the core of what we're looking to collect. How do we simplify that for our sites and patients through these methods of different types of data acquisition?

It could be, as Gaelan mentioned, wearables and other devices, but also with respect to the point that sites are not going away, so how do we simplify things for our sites? Really putting our stakeholder needs first. There's more and more use of technology. What we're seeing at the sites now, what we're hearing from the sites is that, "Hey, you're throwing six different platforms at me. I've got a different login for each one." There's a movement in the industry towards things like single sign-on and leveraging things that other industries are doing so well. It's finally coming into our space. But I think we need to continue to move in that direction and bring that kind of rich customer experience to healthcare.

Can you share a bit more about the work PicnicHealth does, the types of use cases this work supports, and what sort of interest we’re seeing from customers related to RWD for clinical trials?

Evelyn Pyper:

Yeah, absolutely. Similarly, Andrew, as VP of Partnerships at PicnicHealth, I'd say you have your finger on the pulse of pharma customer needs and how they might be evolving. Can you share a little bit more about the work specifically that PicnicHealth does for those that aren't aware, the types of use cases that it might support, and what sort of interest you're hearing from customers related to real-world data use in clinical trials?

Andrew Larsen:

Yeah, absolutely. To start at the highest level, PicnicHealth is a patient-centric real-world data platform where we work with consenting patients to create longitudinal complete real-world data across their journey, and that means operating in a site-agnostic manner. So wherever they've received care in the U.S., we're able to procure the full set of medical information from that facility, including medical records, physician notes, labs, imaging, and create a harmonized data set that's deidentified for researchers and also present that back to the patient for their own benefit and really empowering them to be part of their care journey as well. That same connection that we have with the patient is used to also collect primary data in the form of primary or patient reported outcomes where we can actually create a much more holistic view of the patient experience and what they're going through. I think to Sneha's point, a huge part of this is really you can't add 15 different options for 15 different data modalities, and really how do we streamline this process to collect the information that we need that's critical for the study with the least burden on all the stakeholders in the process?

That's really been something that PicnicHealth has strived to do – starting with the patient, where essentially it's really just the sign-up that's the lift on their part to actually create these data sets for researchers and how that's evolved to actually apply to the trial space, so there's obviously a suite of needs, but maybe [I’ll give] just two examples. From day one when a participant enrolls, we're able to collect a deep longitudinal history on that patient, such that you can really have a nuanced view of really what is the stratification between all the different patients in the study and tie that back to the variations and outcomes that you see, and this is nice to have for a lot of studies, but really critical when you think about the ultra [rare] or orphan space where there are places where you may not be able to actually find enough patients to run a comparator arm and there's no actual natural history that exists. So actually using that either directly, or as supporting evidence for the comparator of those trials when they actually take an investigational therapy, is critical.

Then on the back end it's all about following patients, whether it be that, due to the burden of the study, they are at potential risk of being lost to follow-up, or additionally, it's something where, after the site component has ended, how do you continue to capture what happens in the real world? This is, I think, incredibly important when you think about both regulatory and especially payer applications. When we think about the wealth of accelerated approvals we've seen that are tied to surrogate endpoints, really, how do we capture the outcomes that are happening in the real world later on? Then additionally, I think there's an aspect here when you think about physician assessments that don't occur in the real world: having a single data set that ties together the physician assessments that you were able to administer as part of this study with the real-world outcomes you can see lets you extrapolate what these implications are, in ways that help HTA bodies or payers understand the value that this therapy brings to patients, in a way that’s more apples-to-apples with what they're used to seeing.

We’ve heard terms like ‘patient-centric’, ‘patient-generated’, and ‘patient-mediated’. Can you clarify what these mean and the differences between each term?

Evelyn Pyper:

Thanks so much. I'm a big fan of clarifying terminology, especially in this space when we have lots of solutions, lots of solutions providers, and we hear a lot of terms like patient-centric, patient generated, patient mediated. Andrew, can you clarify what these mean and how you would look at the differences between each term?

Andrew Larsen:

Yeah. Good luck getting a clear definition of patient-centric out of anyone, so I'll tackle that last because that's probably the “squishiest”. But starting with patient-mediated. Patient-mediated is really about when you're working with a consenting patient and that ability to unlock the medical information that they can contribute to research. That's sort of the gateway into that category, though I think the actual spectrum of outcomes from a patient-mediated approach really is contingent on what is the infrastructure and technology that you're using to unlock the information once you have that patient opting in. And I think really PicnicHealth has been built around from day one: how do we take this burden off the patient of capturing this longitudinal journey, and both giving that back to the patient but also enabling researchers as well for that information? Then, patient-generated data, probably the most straightforward of all these. Data that you're generating through engaging a patient, whether that be wearables or patient-reported outcomes.

Then lastly, for the patient-centricity, I think this is something where, as I sort of alluded to, I'm not going to create the dictionary definition, but I can at least talk a little bit about how we think about it internally, where we are an organization of patients as well, so we try to abide by a system that is really built [around] putting them first, and that means starting with them consenting into all of our studies and they have that control over their information. It's also something where we try to–for all the information we collect–we present back to them for their own benefit. Then, also the fact that really our ability to engage and work with patients to make sure that we are capturing what's needed to advance their disease condition, it layers in their input as well through that ability to engage them and collect information, whether it has to do with symptoms, outcomes, quality of life. That's a key component of our operating system.

Evelyn Pyper:

Awesome. Thanks. Maybe that can become the definition? We'll see. We'll try. We'll see what we can do about putting that in the dictionary.

Andrew Larsen:

Best of luck.

If we think about patient-mediated approaches, tokenization approaches or traditional site-based approaches, when do each of these make the most sense?

Evelyn Pyper:

Yeah, thanks. Whether we're talking about patient mediated data or other approaches, I think it's clear that having these options and solutions come into the trial space, it's kind of like a double-edged sword because you have people trying to make decisions on the best approach, but with lots of choice comes lots of questions around like, ‘when is the right fit for each of these things?’. It can be overwhelming, especially for those that are not sitting on the solution side, but really having to make a call with a suite of options in front of them. If we think about patient-mediated approaches, tokenization approaches, or traditional site-based approaches–starting with Gaelan, from your perspective, when do each of these make the most sense or not make the most sense?

Gaelan Ritter:

Yeah. I mean, you make a good point about it kind of being part of the planning exercise, so it really is looking at—we're kind of building out decision trees of when you use these and you kind of maximize the robustness of the data that you acquire, for the least burden on patients and sites that you can acquire it. So when you look at things like patient-mediated and some of the work with PicnicHealth, it's really been a nice opportunity to get large data sets with significant medical history, significant other physician interactions and patient journey without huge burden on patients and sites, and so using capabilities like that primarily is going to be the future of clinical trial data acquisition. Then you go to: okay, I can't get everything from that, so now I'm going to need some things where it's patient input into–whether it's a wearable or an ePRO, eCOA, whatever else–I'll need patient interaction for the next step of things, which reduces some of my burden on the site, but the patient's going to have to be involved.

Then your final step is going to be the more traditional approach, where there's just no other way for me to get this, so the patient's going to have to go to a clinical site, the site's going to have to perform the activity and record the information, and that kind of is the one you want to use as sparingly as possible going forward because it's most labor intensive. I think in the middle of the kind of patient-mediated and the patient-directed and then the traditional site is also going to be where the decentralized trial capabilities come in. That's where you're going to see things like someone going to a local lab for diagnostics, or going to a local imaging center, or going to a general practitioner, rather than going to a clinical site in a major city for their clinical trial. That's where you're going to see some of that middle ground of how we can leverage things that are closer to a patient's home. There's still medical clinics, so there's still burden there. Closer to a patient's home, more convenient to make this more approachable for everyone. It's really just a continuum and you really just try to maximize getting the most robust data you can with the least burden you can possibly do it.

Evelyn Pyper:

I like that. Andrew, would you agree with what Gaelan said? Would you add anything? What are your thoughts?

Andrew Larsen:

Yeah. Absolutely. I think Sneha brought it up even earlier where I don't think the site-based component is going to go away anytime soon. To Gaelan's point, it's very laborious, but it is so critical for having those controlled settings, being able to administer whatever assessment you want, whatever sort of imaging study, having that controlled environment. It's more about what are the ways that you can create the greatest wealth of data, like balancing that with other modalities in addition to that. I think tokenization is a great way to get a lot of low-hanging-fruit information, and if that is what you need to answer your research questions, fantastic. It's really about stepping back and being like, what do you need this evidence to show? If there is a tertiary data set that you can link to that has that outcome, just make sure you are taking into consideration the waterfall: How many patients will actually be able to match? How many patients will have the temporal coverage that you need? Additionally, if there's a second or third data set that you need to link as well, apply that subsequent downstream waterfall from those criteria as well.

And so in large studies with hard outcomes that sort of exist in claims data sets, that's usually still sufficient, but there's a lot of places where I think patient-mediated becomes more relevant when you're considering places like needing coverage across the population. And that coverage is very important to incorporate things that are in the unstructured notes or deep within the clinical text, including things that may have to do with imaging, that may have to do with patient-reported outcomes, or may have to do with those sets of information. And then linking to other things as well that may exist in like cost information, like claims for economic analysis. To say probably the most overused phrase on these calls, it has to be “fit-for-purpose” at the end of the day. So the decision trees you sort of teed up earlier, it's like figuring out for what you need, what is the right suite of solutions.

Where do you see the most challenges with traditional site-based approaches? Alternatively, when do other approaches make the most sense?

Evelyn Pyper:

Right. Thanks so much. Sneha, where do you see the most challenges with traditional site-based approaches? Alternatively, when do other approaches make the most sense?

Sneha Kishnani:

Yeah. It's a good question. I mean, thinking very specifically, we can think about rare disease studies–rare disease, ultra rare disease studies–where it's difficult to bring up a site across a geography, a specific country or region, and each site can only enroll 1-2 patients a year because that's all they have. The industry addresses this by targeting specialty centers where there's a larger population of these types of patients, but it's not a perfect solution. I mean, this is where I think the decentralized approaches can really help allow patients to, like Gaelan was saying, get their labs done through the local lab, imaging done through their own imaging centers near their houses.

I think the other thing that we also need to take into consideration is the geographical landscape of all of the solutions that we're talking about. Sometimes the regulatory landscape is such that the regulators are a little bit slower to adapt to some of these solutions that we're talking through, so that also needs to be considered. And in that decision tree that Gaelan was talking about, consider what is the relative risk if we are to take an innovative approach or an innovative data collection or acquisition solution here, and is it something that the regulators could maybe be open to? So having those connections and having those pre-discussions with the regulators is also critical in this case.

How does the timing of evidence generation planning come into play in the work that you do around decentralized trials and what, if anything, needs to change in order for folks to see greater success with these approaches?

Evelyn Pyper:

I'm imagining that…. there's a lot of external factors as you mentioned, so even with a very well-thought-out, comprehensive real-world data strategy from a researcher, from industry researchers, the factor of timing is also always going to come into play. So Company X has an urgent need for a data set, they need evidence in six months from now. Probably not an uncommon request for some of you to get. Sneha, how does the timing of evidence generation planning come into play in the work that you do around decentralized trials and what, if anything, needs to change in order for folks to see greater success with these approaches?

Sneha Kishnani:

We've all been there, right, where like you mentioned, there's an urgent need to fill a data gap and we need the data yesterday. So the pre-planning, the early planning, the sooner you can get the right players in the room to start talking through that evidence generation and even evidence dissemination, the better. Before your Phase III study begins probably is the right time. Even earlier is better. But I think that's really it, I don't know that I need to be verbose about this, but early is good, earlier is better. I think that there are all kinds of issues with the lack of data being available. There are cost concerns to the sponsor. There are concerns around product uptake during launch. It creates all kinds of knock-on effects, so if we can be proactive about those conversations, we can avoid some of that and we can make the whole research gathering process a bit smoother.

Does having related questions coming from across multiple parts of the organization impact what data sources or approaches you use, and if so, can you share any examples of some common cross-functional RWD needs?

Evelyn Pyper:

You made a really good point. It's not just about ‘early’, but it's about getting the right people in the room early, which can also be part of the challenge. So like within a biopharma organization like yourself, Gaelan, I know we're seeing this cross-functional decision-making happening and thinking about how data can be used to serve multiple groups. From your perspective, Gaelan, does having related questions coming from across multiple parts of the organization impact what data sources or approaches you use, and if so, can you share any examples of some common cross-functional real-world data needs?

Gaelan Ritter:

Yeah. Absolutely. It certainly impacts it. I mean, the reality is that high quality, large real-world data sources are still rare in a lot of instances, so there's a huge amount of cross-functional need for these data sources, and some obvious examples are going to be ones that no one's surprised by. Long-term patient safety and efficacy is an easy example. That's an important endpoint for clinical trials. It's an important factor for the kind of progression of trials from Phase I to II to III. It's also hugely important for the patient safety monitoring teams and for the health economics and outcomes research teams and for your field medical colleagues. You see everyone asking the same question, and in the past, one of the things we found with normal traditional clinical trials is that the trials weren't designed to answer the questions of some of the teams that came downstream. It was designed to answer the questions of the specific person designing that specific trial.

To the points that everyone's made, that world is a little bit gone in the sense that everyone now recognizes that there are other users for these data sets inside pharmaceutical companies, and the need is to really expand the capabilities by acquiring the data that you need in the different locations and then leveraging that across everyone involved in the pipeline of that asset's journey. So, patient safety is an easy one. Patient journeys are another. I mean, Andrew mentioned tokenization and the value of linking the claims data sets and other things. For years, patient journeys have been used in outcomes research and in those types of capabilities, and we're seeing more and more of patient journeys creeping into the earlier development cycles of assets. You're thinking about: How do patients experience the drug that I'm creating? How's it going to change the treatment pathway? How's it going to impact the way medical practice is handled in some of these settings, especially in rare diseases?

One of the keys is better understanding that patient journey. Understanding the journey that exists today. Understanding the journey that's going to be created from the asset that you're hoping will be successful. And you're seeing a lot of cross-functional use. It's great: you're seeing teams communicate with teams that they never did before because they're realizing that they share common interests. And that also means that on the real-world data side, you're seeing demand for data sets that can serve multiple people's goals. And that might mean incorporating data into a clinical trial data set that wouldn't have been there before, or incorporating some of those patient-reported outcomes into a later-phase study that wouldn't have classically included them because they're not the primary efficacy endpoint for the trial. But being able to get those learnings means that everyone can cross-use some of those data sets.

Evelyn Pyper:

Right. Thanks. I mean, curious, do you think that we'll get towards the kind of ‘unicorn’ data sets, where they are serving all stakeholders? Or do you feel like this is aspirational at the moment?

Gaelan Ritter:

We're getting close. I think it's going to be a while until you really get there. I don't think your Phase IV safety studies are going away tomorrow. I think we're getting closer to being able to answer the questions of multiple teams, but there are still pieces missing at different journeys. And I think a lot of that goes to Sneha's point: unfortunately, as you're building out the depth and completeness of real-world data sets, you're also realizing where your gaps still are and you're realizing how often you rely on some of those traditional clinical techniques in your trials and in your data acquisition. So we're getting closer and then also realizing just how big the gap still is. I think there's a lot of room left to make improvement there, but absolutely it's moving in that right direction.

Can you share some examples of how patient-reported, caregiver-reported, or clinician-documented data are being used in clinical research today?

Evelyn Pyper:

Yeah. I guess the more we learn, the more we know what we don't know and still need to fill. As we are moving towards more of this kind of personalized healthcare vision, precision medicine, we know that the reality is that decisions, whether regulatory, payer, or HTA decisions, might not always be made with the same types of coded data that we've seen in the past. So across all types of studies, clinical studies and post-launch studies, we know there's a growing need to go deeper with the data. Andrew, I'm wondering if you can share some examples of how these emerging (and maybe not so novel anymore, but in some ways novel) patient-reported, caregiver-reported, or clinician-documented data are being used in clinical research today?

Andrew Larsen:

Yeah. Absolutely. Maybe starting with the clinician-documented. Coded information is usually the backbone of a lot of go-to sources, but I think there's a growing appreciation for the wealth of information that is buried in the physician notes underneath the general coded information, where it's really all about understanding what is either under-coded, non-coded or inaccurately coded. And so that can be things about just understanding what the patient population is: subtypes, disease severity. It can be about understanding outcomes, so either physician assessments, major signals of disease activity, symptomology, as well as treatment use and effectiveness, including mentions of treatment switch or discontinuation. Really I think this is a backbone of a lot of the different studies that PicnicHealth engages in, and I think it's…

There are a lot of places that we expected the value to emerge, but some of the less ‘cookie-cutter’ value adds have even been about understanding the inefficiency of the coded information. So we published in lupus nephritis about the ability to actually more accurately diagnose around 100% more patients by not relying on ICD-10 codes alone but going by additional information across the medical record, including narrative text, as well as about the shortcomings of relying on those patient populations that are coded today in hemophilia B, where at least 40% of them had an inaccurate hemophilia A diagnosis, a third of which didn't even have any mention of hemophilia B. I think truly understanding the patient and their outcomes really relies on going deeper within that information. And then, where you're actually talking about a patient-centric platform, the question is: how can you bring their input and that of their care team into this equation? That is nothing new for the field, as this has been growing steadily over the past decade, where the majority of regulatory submissions now include some component of patient-reported outcomes data, whether it's about symptomology, functionality, or quality of life, and those are increasingly making their way into the actual labels of products.

Another unique application we think about, as it relates to trials, is that you'll have these patient-reported assessments done as part of the study. For example, understanding bleeds at a very frequent rate is key for a lot of hemophilia studies. And what we've tried to recreate in our real-world data studies is biweekly bleed patient-reported outcome assessments, to create the ability to track these outcomes that are never really assessed in the real world. You can go into the documented physician notes, but they'll never be as comparable to what you have going on in the trial, so I think that's another major advantage as you look to more holistically understand how trial populations compare to the real world.

Then additionally I would say for the caretaker, this has been a growing focus of PicnicHealth over the last year. A lot of the major areas of therapeutic development are beginning to appreciate that the impact the therapy brings is not just on the patient, it is on the entire care network of that patient. We are doing caretaker-reported outcomes in Alzheimer's, Huntington’s, and a lot of pediatric conditions, where it's about: What is the societal burden that this is bringing on for the entire care network? How do we actually characterize that? Because this is part of the holistic value that it's actually bringing to society. Here are all the other people affected who are absent in almost any data set you look at, and there's a huge way for these advancements to bring relief to the families in addition to the patient directly afflicted by these conditions.

Can each of you share one example of something you think will look very different in 5 years, when it comes to use of RWD and alternative approaches in clinical trials?

Evelyn Pyper:

Thanks. We're getting a lot of great questions in. We want to make sure we have time for the audience Q&A, but I do want to wrap up with a prospective-looking question. We've talked today about how the way drug therapies are developed and studied continues to evolve, and to wrap up this main portion of our Q&A, can each of you share one example of something that you think will look very different in, let's say, five years when it comes to the use of real-world data or alternative approaches in clinical trials? Like what's your five-year outlook? We'll start with Sneha.

Sneha Kishnani:

Five-year outlook: I'm going to maybe answer this question with a little bit longer time horizon, but I think patients are becoming more knowledgeable and more empowered about their data and I think patients are really the linchpin here in terms of getting access to data. If we can continue to educate them on the value proposition of the data that we are using of theirs and the methods that we're using, and then return those results back to them in a way, I think that five years from now we're really going to see an unlocking of all of the types of data, and to Gaelan's point, “closing the canyon” of the gaps that we're seeing. Maybe in five years. Maybe the time scale is a little bit longer on that, but I am generally hopeful that knowledge is power and it is going to lead us to an overall healthcare system in which patients are able to manage their illnesses better, one, but then also lead to just a healthier population. Because I think if, I'll just tell you if it were me, if I am getting information back and insights back about my health and my health status, if I can make changes to improve my health and my life, I'm going to be making those changes, so I think that there's something there.

Evelyn Pyper:

I love that. Andrew, what does your crystal ball say for five years from now?

Andrew Larsen:

Yeah. Also love that. Clearly biased, as our entire platform is built around empowering patients with more of their medical knowledge, so very excited to see that growing as a focus on how we make patients more front-and-center in their own care journey. Not to echo what Sneha and Gaelan have already brought up, but I think integrated evidence planning, while not new right now, is really growing. And I think hopefully there are going to be fewer and fewer people showing up at my or Sneha's doorstep asking for a data set that doesn't exist and needing it yesterday, as there's more forward thinking, cross-functionally, about what we need. I think that's going to play out in more initiatives to future-proof the study by having part of the consent forms ease the ability to collect longer-term real-world data associated with these participants.

I think the aspects of the planning around it may shift as well. If you think of the broader landscape across drug development, there are obviously some legislation changes that are going to shift the focus and urgency around the timeline on which you need evidence, not only to get approval, but once that approval is there, how do you maximize what you can communicate to all of the stakeholders involved in the care journey? What is the evidence you can show up to payers with on day one vs. one year vs. two years, a timeline we used to have a lot more flexibility on, that shows the tangible benefits of this therapy in ways that they can understand? And also bring that same pace to patients and their providers to make sure that they can make the right decision of whether this therapy is a choice worth considering. And so I think there's going to be a lot of evidence planning that is really about: how do we ensure that all the stakeholders in the continuum here are going to have the evidence that they need much faster than we've historically considered permissible?

Evelyn Pyper:

Yeah. I like that too. Gaelan, final word, what does that five-year future look like from your eyes?

Gaelan Ritter:

I love both of the ones that came before. I think those are exactly on point. I guess for me, the other thing that I might add to it would be I think our kind of patient recruitment tactics and patient enrollment tactics are going to look totally different in five years. I think the advent of using patient journeys and claims data and new world EHR sources to find patients in their medical records and be able to identify best-fit patients for trials is emerging so quickly that I think in five years it's going to become the dominant way that patients are identified for clinical trials.

I think the days of, well, when I was a clinical research coordinator at Georgetown, the days of sitting there on Friday afternoons going through the charts of all the patients coming in next week, as if I could get through all of them, to find patients for trials are going to be over. Instead it's going to be targeted lists, using real-world data and other data sources and algorithm development to create target lists for sites, to really understand what the flow-through of patients is and who's the best fit for the trials. Really being able to triage that out is going to drastically change the landscape of how sponsors function and also how sites function in that kind of enrollment space.

Evelyn Pyper:

Thank you. Yeah, certainly the possibilities have me excited. I think it has people in the chat excited. We have some great questions coming from the audience, so I will pass things back to Sydney to facilitate this next, final part of the session.


Sydney Perelmutter:

Thank you all for that very insightful presentation. I would like to ask our audience to continue sending in their questions for the Q&A portion of the webinar. But I've already received many questions, so I'll start with our first one. Our first question is, what are the current challenges in reconciling different RWD collected using a variety of recording techniques and technologies within RWE studies?

Gaelan Ritter:

Who wants to go first? I can jump in with some of the challenges that we see, I guess.

I think one of the things we see is that disparate data acquisition sources can have different… it's not that the data's conflicting, it's that the data's collected with different intent, so if you had a patient-reported symptom versus a clinician-reported symptom, those are going to look different. Even though they might be getting to the same underlying part of the disease, they're going to be represented differently and we see that across different types of real-world data. So you'll see a very different kind of catalog of information from a patient-entered source versus a site-entered source. We see the same thing when you start looking at the kind of linkages between data sets that have different intended end uses. So a claims data set is going to represent a disease state very differently than an EHR data set because they're intended for different purposes.

We actually see a lot of data that needs concordance more than it needs cleaning. And so we see that a lot more often now as you're starting to, like Andrew mentioned, tokenize and link different data sets that had different reasons for existing. You're starting to see the emergence of a lot of need for harmonization across those, and that takes a little bit more sophisticated effort than just a mapping table in controlled terminologies. So I think that's something that's emerging more for us as we start to get into these more complex, compiled real-world data sets.

Andrew Larsen:

Yeah. I think that's all right. And it's what Gaelan teed up earlier, where data acquisition is one of the challenges. I think that's the first step for PicnicHealth at least, because we collect medical records across any care facility, from a rural PCP to a high-end academic institution, and part of what we have really put a lot of thought and energy into is: how do we actually harmonize these different disparate data sources into a single usable data set? I think as we've overcome that challenge, the exact next one is concordance: how do you reconcile all of these different pieces of information with different intent? I think our fallback is that there is no silver bullet. It's very much: what is the question you're asking, and what are the pieces of information that should be indexed on to best reach a defendable conclusion? I'll end it there, and I'm sure there's probably a lot left unsaid, but happy to discuss further if interested.

Sydney Perelmutter:

Thank you for those answers. Another audience member would like to know, what are your thoughts and what are you seeing with sponsors using registry or open-access patient journey data as a quote-unquote "placebo group" for open-label studies?

Gaelan Ritter:

I'll start this one too, I guess. One of the things…so we are seeing more and more of that. You're seeing a lot of [this] especially in rare diseases. If it's hard enough to find patients to participate in the trial and treatment arms of some of these diseases, being able to find patients to participate in control arms is next to impossible. And especially we're seeing more and more that obviously clinical trials are becoming a therapeutic alternative, especially in rare disease and in late stage disease, and because of that, in those assets you literally don't have many patients that aren't part of clinical trials. We're seeing a lot more use of those data sets both in the classic kind of external control space, but then also seeing them being used as–whether it's digital twins or other types of capabilities that are emerging–to be able to provide insight about the patients on the treatment arm of the trial without having to classically enroll like you would for other trials.

You're seeing that emerge in a lot of ways, whether that's in hybrid control models, synthetic models, or the digital twin model. It's actually been very beneficial though, to be totally honest, because getting drugs to patients with rare diseases is critically important, and the timing is so frustrating because there are no other alternatives for those patients, so delays in those asset journeys literally impact people's lives. Being able to build bodies of evidence through new statistical techniques has been a real benefit to a lot of those patient groups.

Sneha Kishnani:

Yeah. I mean, I can add to this one. We do see this all the time with respect to disease registries as well as natural history of disease studies that are also being run in addition to any kind of broader registry. I think that there is an acceptability around it and it's happening more and more.

Andrew Larsen:

Yeah. I would just say I think at least for the last three years, every year there's been at least one drug approved with this sort of approach. So while regulatory guidance still always indexes preferentially on the ability to have a classically randomized controlled trial with a placebo or standard-of-care arm, I think this is something you have to consider on a case-by-case basis: is this truly a landscape where the unmet need and the actual feasibility considerations for a trial justify this? I think we've consistently seen an openness to that, which is great for patients. I think then there's getting that initial access to patients, whether it's an accelerated approval potentially, and then there are the follow-on confirmatory studies, which can take a very long time, right? But I think usually there's that consideration where it's like, let's do the right thing now based on this body of evidence that we feel comfortable with, and then do confirmatory studies as appropriate for any outstanding questions.

Sydney Perelmutter:

Thank you. Again, we have a question for Gaelan specifically and then we'll do one more question. That question for Gaelan is, can you talk about innovative ways that can help to identify the best participant for a research study?

Gaelan Ritter:

Sure. Absolutely. We're doing a lot more with using access to EHR records and to claims records to be able to map where patients are. Some of that is kind of just looking at population dynamics and heat mapping and then taking the next level and deploying algorithms that actually search medical records for patients that meet not only the I/E criteria of the study but look like patient groups that we've seen in previous trials that would benefit from those studies. So it's really kind of about leveraging that EHR data access and the real-world data sources that we have and being able to create algorithms that look for those best fits. It's a lot of the work that the clinical research coordinators and investigators do manually today, which takes an enormous amount of their time, and so being able to automate some of that and present them with those best fits has been a real advantage and a real opportunity for all of us.

Sydney Perelmutter:

All right. We have time for one more question. We've talked about comparisons and differences across patient-mediated vs. tokenized vs. traditional site-based approaches, so can you all share some examples of if or how these approaches can be used together?

Evelyn Pyper:

I guess maybe a question to that question that I would put out there is, if the spectrum between site-based and decentralized has infinite points in between, what does that kind of ideal hybrid model look like to folks on the call? It feels like it could be so many things. Patient-mediated data is not the antithesis of collecting site-based data, so what does that ideal hybrid look like?

Sneha Kishnani:

I think there is no ideal hybrid. Right? I think that's the beauty of ‘hybrid’ is it can be what it needs to be for whatever research question we're trying to answer. It's having the awareness around, what are the different data types and data sources out there. And having the deep insight and the deep knowledge into: does that data source have the information that is going to help us to answer the question that we want, and where do we need to get the data from? So in essence, looking at a protocol, looking at the study objectives, the endpoints you're after and understanding who is the right source of all of this data? Does it need to come from a site? Can it come from the EMR? You know, et cetera. So really understanding deeply where that data lives and who can best provide it in an accurate and high quality way.

Andrew Larsen:

Yeah. I echo that same thought. I mean, it has to be fit-for-purpose. What are the questions? What are the data sources that support it? And really think critically about the feasibility of what you're considering, right? Just because a data source has ‘data element X’, are you actually going to get the coverage you need or the temporality you need for the patient populations that match? I think being diligent pays off and makes this all a much easier process downstream.

Sydney Perelmutter:

All right. Well, thank you very much for those answers. This concludes the Q&A portion of this webinar, and if we couldn't attend to your questions, the team at PicnicHealth may follow up with you, or if you have further questions, you can direct them to the email address on your screen. Thank you everyone for participating in today's webinar. You'll be receiving a follow-up email from Xtalks with access to the recorded archive for this event and the survey window will be popping up on your screen and your participation is appreciated as it will help us to improve our webinars. Now, please join us in thanking our speakers, Gaelan Ritter, Sneha Kishnani, Andrew Larsen and Evelyn Pyper. We hope you found this webinar informative. Have a great day everyone.


