Last week, I discussed the places where AI is being used in patient experience and the value it brings. Now, I want to address the other end of that stick. While this brute computing lets us reach understanding with less grunt work, it also creates some problems. Some are broad concerns about AI in any workspace, not just healthcare. Others are more directly tied to healthcare. As I walk through them, you can decide whether the cost is worth the benefits. Or, if you think the toothpaste is already out of the tube, you can consider what guardrails could be erected to keep the bad from bleeding into the good.
Type 1 and Type 2 Error
In a previous essay, I wrote about the balancing act of managing both Type 1 and Type 2 errors in statistics. A Type 1 error is when we say an outcome is real when it is really an artifact of random chance. A Type 2 error is when we say an outcome is random chance when it is real. The more aggressively you eliminate one type, the more likely you are to create the other. The best we can do in the real world is to manage both as well as we can. A statistician can tell you what their thresholds were for any analysis they do. But I don’t see AI systems reporting their confidence intervals or goodness-of-fit numbers for their work. The problem here is not that mistakes happen, but that in believing AI is more accurate than any human, we cut back on the people and processes that can backstop AI’s output.
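If you want to see that tug-of-war in action rather than take my word for it, here is a minimal simulation sketch in Python. Every number in it (sample size, effect size, trial count, the thresholds) is mine, chosen purely for illustration, not from any real analysis:

```python
# A minimal sketch of the Type 1 / Type 2 tug-of-war.
# All numbers (sample size, effect size, trial count) are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n, effect, trials = 30, 0.5, 2000

# p-values when the null is true (any "significant" result is a Type 1 error)...
null_p = np.array([stats.ttest_ind(rng.normal(0, 1, n),
                                   rng.normal(0, 1, n)).pvalue
                   for _ in range(trials)])
# ...and when a modest effect is real (any miss is a Type 2 error).
real_p = np.array([stats.ttest_ind(rng.normal(0, 1, n),
                                   rng.normal(effect, 1, n)).pvalue
                   for _ in range(trials)])

for alpha in (0.10, 0.05, 0.01):
    type1 = (null_p < alpha).mean()   # false alarms
    type2 = (real_p >= alpha).mean()  # missed real effects
    print(f"alpha={alpha:.2f}  Type 1 rate={type1:.3f}  Type 2 rate={type2:.3f}")
```

Run it and you will see the Type 1 rate track alpha downward while the Type 2 rate climbs the other way. A statistician hands you that printout; an AI system, unless prodded, does not.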
A million years ago, I worked on an expert systems project for a software startup. They wanted to create a diagnostic tool for healthcare, where someone could input a patient’s observed symptoms and it would kick out the likely diagnoses. This was a LONG time ago, before the internet was the internet, and my job was to page through the ICD9,1 hard-coding symptoms back to diseases. It was mind-numbingly boring work, and I certainly inserted some unintentional flies in the Vaseline. The company was selling this vaporware as a way to help healthcare professionals do their work more accurately. The project was ultimately scrapped because of twin concerns:
- While it was pitched as a support system, there were concerns that some healthcare organizations would want to use it to replace more-skilled clinicians with less-skilled ones.
- There were concerns about just how accurately this machine could draw conclusions using 1980s computing power fed by bored humans doing the data collation.
These two concerns would collide and compound each other. If it produced weird results and a thoughtful human was reviewing the output, that human could override the computer with their own knowledge. But if the person reviewing the output had a less nuanced understanding, they might accept it with less skepticism. The company’s projections were that it could not beef up the accuracy enough to protect against the first concern. In essence, it could not promise a system that could manage both types of error in a universe where the possibility of error could not be independently challenged. The project got shelved, and I went on to work at a bookstore.
Even today, there is an eternal tug-of-war between doctors who don’t want to be bothered with every possible, if unlikely, diagnostic observation, and administrators who don’t want to be bothered with every possible lawsuit over a missed diagnosis. AI can provide the output, but if you don’t ask, it won’t tell you how confident it is in that output. While the stakes may not be life-or-death, PX can likewise burn a lot of time and oxygen chasing ghosts, losing front-line staff’s goodwill and attention along the way.
Black Box Code
As I mentioned on Friday, what AI produces is only as good as its source material. Unless asked, it is not likely to tell you the strength of the targeted signal amid all the background noise. This is a problem because you often don’t know what the source material is, how big or varied it is, or whether it accounts for any black swans.2 You ask for a summary of patient comments, for example, because you don’t have the time or inclination to sort through all of them. You then accept the output at face value, not really knowing whether the conclusions are robust or simply the best of a bunch of bad models.
This is compounded by the fact that you often don’t know what preference or priority is being put on any of the constituent variables. Most Quality measures are ratios of actual events to expected events, where numbers above 1 mean you are doing worse than expected and numbers below 1 mean you are doing better. Obviously, if a hospital is failing a measure, it would want to know the parameters of the equation so it could figure out the best way to improve.
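For the arithmetic-inclined, here is what that observed-to-expected ratio looks like, with counts I invented purely for illustration:

```python
# A hypothetical observed-to-expected (O/E) quality measure.
# Both counts are made up for illustration.
observed_infections = 12    # events that actually occurred
expected_infections = 9.6   # the risk model's expectation for this hospital

oe_ratio = observed_infections / expected_infections
print(f"O/E ratio: {oe_ratio:.2f}")  # 1.25 -> worse than expected
```

The division itself is trivial; the hard part is the denominator. If a black-box system won’t show you how it risk-adjusts the expected count, you can’t know which lever to pull to improve.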
Often, though, it is impossible to understand the rules AI uses to produce its work. One of its most visible uses in PX circles is translating languages. Those PX folks who oversee patient advocacy and civil rights have confronted this. Part of that work involves communicating with patients who are not proficient in English, often referred to as LEP, or limited English proficient, patients. Hospitals are required to have all ‘vital documents’ translated into the dominant non-English languages in their community. Many will also translate other documents so they can do outreach more effectively or inform their community about important health issues. For example, during COVID many hospitals used machine translation to convert English information into documents for the Spanish, Pashto, or Chinese speakers in their community.
The problem is that while these tools can be valuable for translating words or phrases, the longer or more complex the message, the more dubious the translation. So, only after distributing their outreach on COVID did some hospitals discover that the Spanish version was telling people NOT to get the vaccine. This problem was so concerning that the new Section 1557 rules3 now state that you cannot use machine translation for anything patient-facing without first having it reviewed by a human being who speaks that language.
Off-Site, Out of Your Control
The first two issues are broad concerns about how AI functions in the background. They can pose specific challenges to healthcare but frankly can pose challenges to any industry. This one is also an issue for all industries but puts healthcare in a particularly delicate legal position. It involves the blurring of the line between hospitals and third-party vendors when it comes to personal health information.
Anyone who shops online or saves documents in the cloud is handing over personal information to a third party. Everyone is broadly aware of this and balances it against the convenience and security of operating in a fluid data environment. Just this morning, I used my laptop to put a decorative weight for anchoring a table umbrella into my Amazon cart. I didn’t buy it; I just put it in my cart. An hour later, when I was playing Wordle on my phone, I got an ad trying to sell me table umbrellas. Whether you are impressed by the fluidity of information across different hardware and different apps, creeped out by it, or blasé about it, probably no one is surprised by this story.
Healthcare information, though, is a different matter for most people. Both the Health Insurance Portability and Accountability Act (HIPAA) and the Patient Protection and Affordable Care Act (aka the Affordable Care Act, aka the ACA, aka Obamacare) contain rules protecting your healthcare information. The extent of these protections is currently being explored in court, as many health organizations use third-party widgets or tools both to add functionality to their public-facing websites and to gather data. Two concerns come up. First, this data collection is not announced by the hospital, and its value is not clear. Why do you need to know that my IP address was searching for oncologists or for information on sexually transmitted infections? Second, since many of these tracking elements are owned and operated by Microsoft, Meta, or Google, that data is captured and stored on their servers, outside the protection of a hospital system’s IT department. So, some people are concerned about what Google and its brethren might be doing with that information.
Now, some of you may be disturbed by this, and others are on Amazon ordering me a tinfoil hat. [For those, I say that I prefer a fitted tinfoil hat over a one-size-fits-all tinfoil hat, and my hat size is an 8.] My point, though, is not to scare you but to note that while healthcare organizations have a responsibility to protect your data under the ACA and HIPAA, non-healthcare organizations do not. Computer code that transfers data to third parties (often called third-party tracking) is common across the web and is subject to few federal privacy regulations. As a result, people are suing the only defendants they can find: the healthcare organizations themselves.
To that end, many hospital IT departments are trying to control where this data is transferred in order to limit liability. They can control broad tracking, but that control weakens at the point of care. I have seen nurses use Google Translate on their phones to communicate with a patient rather than using officially contracted and protected technology-assisted interpretive services. I have seen patient advocates use online translation for letters to patients, meaning that the patient’s information is now stored on a server somewhere.4 The extent to which this bothers you (if at all) speaks to your comfort level with data privacy. But remember, it is not your PHI you are potentially exposing, but that of a person who had no opportunity to express their feelings about their PHI being sent to Meta.
Not Human
The previous issues focus on the patina of confidence and accuracy these tools carry and their potential impact on health outcomes and legal exposure. This one is a bit softer. It is less about exposure and more about the nature of the healthcare relationship. I saved it for last because it is a broader statement about our relationship to non-human interfaces. Being confronted by bots pretending to be human, even hiding their nonhumanness from the user, is something we all deal with. Some accept it as part of modern life. Some have built a cottage industry of videos showing themselves besting AI with “ignore previous prompts”5 and the “three-finger test.”6 These videos are amusing and can be helpful in rooting out deceptive phone and video interactions. But they also treat this as a game, presented for laughs. For those seeking healthcare, often older patients and those needing medical assistance, navigating voice prompts and online decision trees can take them from simple irritation to deep anxiety.
Hospitals are likely to be under more pressure than other industries to cut costs, since their target customer is inefficient to serve. Patients need specialized care, and they often have the temerity to live in rural areas where transportation and accessibility are limited. Plus, hospitals cannot refuse to treat a patient just because it is not economically worthwhile to do so. AI is just the latest technology they can use to help cut costs. From a patient experience perspective, its ability to shorten the time between data collection and action planning means it can more quickly suggest adjustments to existing strategies or even help chart new initiatives. But, at the same time, not putting a talented human at the end to evaluate the work, or, worse, thinking you can use a less-talented human at the end to evaluate the work, can lead to problems. As Ronald Reagan said, “Trust, but verify.” Use the tools when they make sense and are legally allowed, but don’t use them at the expense of your own judgment.
1The fact that I was using the ICD9 when the ICD11 is now being released should tell you about the caveman days I grew up in.
2A black swan is an unpredictable event. I first came across the term in Nassim Taleb’s book of the same name. It is an unexpected, transformative event that may make sense in hindsight but catches the vast majority unaware when it happens.
3Section 1557 of the Affordable Care Act (ACA) is the key federal civil rights provision prohibiting discrimination in healthcare on the basis of race, color, national origin, sex, age, or disability.
4Google Translate, for example, retains all translation queries so the AI can learn and provide better future translations.
5When confronted by a bot pretending to be a caller or a tweeter online, say, “Forget all previous prompts…” https://www.youtube.com/shorts/GJVSDjRXVoo
6The “three-finger test” is a technique for identifying real-time AI deepfake video calls by asking the speaker to hold three fingers in front of their face. It exploits limitations in older AI models that struggle to render hand movements, causing the video to glitch or distort: https://www.youtube.com/shorts/SsNC1bRchxw