Why screen scraping still rules the roost on data connectivity

A deep dive into data aggregation with a focus on regulatory challenges, common standards for data sharing and the transition to APIs

At the start of the new year, you made a promise to yourself that you’d do a better job of handling your finances. So you do what any tech-savvy personal finance guru would do: you get an app that helps you achieve just that. Whichever personal finance app you pick, the goal is much the same: analyzing your spending habits and offering advice on how to spend your money more effectively. Fortunately, you don’t personally have to think about screen scraping, APIs or any of the back-end functionality; fintech innovators have this under control.

Working in the background, there are companies called data aggregators, including Plaid, Envestnet | Yodlee, Finicity, MX and others, that are responsible for safely transmitting your account information between financial institutions and to third-party apps. Aggregators make it easier for customers to transfer their data to third parties, but there are concerns around the security and reliability of the connections, particularly when they’re carried out through screen scraping.

But how do these apps collect and access your data? Typically, the app asks for permission and you grant it access to your data, according to a Medium article. The companies then access this data through one of two common methods: screen scraping or application programming interfaces (APIs).

Let’s pause for a second and break down what these jargon terms mean. Screen scraping lets you sign into your bank account through a third-party app so that your banking information can be transferred to it. What’s happening is that the third-party app, through an aggregator, logs into the banking application as if it were the customer, “scrapes” the customer’s data and pastes it into its own platform.

Meanwhile, APIs are seen as a safer way to transfer data, functioning as pipes that connect software components. API standards incorporate security, rights and permissions. In other words, an API connection allows computers to talk to each other using a common format, an FDX report explains.
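To make the contrast concrete, here is a minimal sketch of the two approaches reading the same balance. The HTML snippet, the JSON payload and all function names are hypothetical and invented for illustration; they do not reflect any real bank page or the FDX format.

```python
import json
from html.parser import HTMLParser

# Hypothetical bank page markup; a real page differs and can change without notice.
SCRAPED_HTML = '<div class="acct"><span id="balance">$1,234.56</span></div>'

# Hypothetical API payload in a made-up format (not the FDX schema).
API_JSON = '{"accountId": "acct-1", "balance": {"amount": 1234.56, "currency": "USD"}}'


class BalanceScraper(HTMLParser):
    """Collects the text inside the element whose id is 'balance'."""

    def __init__(self):
        super().__init__()
        self._in_balance = False
        self.balance_text = None

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get("id") == "balance":
            self._in_balance = True

    def handle_data(self, data):
        if self._in_balance and self.balance_text is None:
            self.balance_text = data.strip()

    def handle_endtag(self, tag):
        self._in_balance = False


def balance_from_screen(html):
    """Screen scraping: dig the number out of markup meant for human eyes."""
    scraper = BalanceScraper()
    scraper.feed(html)
    if scraper.balance_text is None:
        raise ValueError("balance element not found; the page layout may have changed")
    return float(scraper.balance_text.replace("$", "").replace(",", ""))


def balance_from_api(payload):
    """API: read a named field from a structured, agreed-upon format."""
    return float(json.loads(payload)["balance"]["amount"])


print(balance_from_screen(SCRAPED_HTML), balance_from_api(API_JSON))
```

Both functions return the same number here, but the scraper depends on the page's layout: if the bank renames or moves the balance element, it fails, while the API field keeps its agreed name.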

Wait a second, why do I care? 

It’s no secret that when done correctly, APIs give consumers more control over who has access to their data while also being more secure, stable, accurate and fast, according to the report. But for a variety of reasons, screen scraping, which is largely unregulated, remains the vastly more popular tool used today in the U.S.

Let’s take a look at why we’re still using this tactic, why regulators aren’t putting in more elbow grease to regulate it and how the inevitable shift to APIs could lead to a world of “open finance.” We’ll also explore what banks and aggregators say about why the industry isn’t moving faster to adopt API-based standards.

Pulse check: Context & Challenges 

A report by the Financial Data Exchange (FDX), a nonprofit that unifies financial institutions around a common standard for data sharing, found in a survey that over 12 million U.S. consumers have transitioned from screen scraping to a version of the FDX API. Between 65 million and 85 million consumers still rely on shared login credentials and screen scraping for data access and sharing, FDX estimates.
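For a sense of scale, the FDX figures can be turned into a rough share. This is a back-of-the-envelope sketch only: it assumes the two groups do not overlap and treats "over 12 million" as exactly 12 million, neither of which the report states.

```python
# Figures quoted from the FDX estimates in the text (millions of consumers).
API_USERS_M = 12             # "over 12 million", treated here as a floor
SCRAPING_USERS_M = (65, 85)  # low and high ends of the FDX estimate


def api_share(api_users, scraping_users):
    """Fraction of data-sharing consumers on the API, assuming the groups don't overlap."""
    return api_users / (api_users + scraping_users)


low = api_share(API_USERS_M, SCRAPING_USERS_M[1])   # against the high scraping estimate
high = api_share(API_USERS_M, SCRAPING_USERS_M[0])  # against the low scraping estimate
print(f"API share of data-sharing consumers: roughly {low:.0%} to {high:.0%}")
```

Under those assumptions, consumers on the FDX API make up somewhere around 12% to 16% of those sharing their data.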

There are numerous reasons why screen scraping is still so common. Screen scraping is a viable technology when practiced by responsible actors, but largely it isn’t regulated or supervised, said Brian Costello, Envestnet | Yodlee Vice President of Data Strategy & Strategic Solutions. Envestnet | Yodlee is a data aggregation and data analytics platform.

“The risk is that if it’s a bad actor that has access to those credentials, they’re going to access way more data than they need to power the use case, they might not safeguard that data properly [and] they might not have the governance in place to manage that data properly,” Costello said.

If we know that APIs are essentially a better method than screen scraping, why are we still relying on screen scraping so heavily? Again, there are a number of reasons. One is that screen scraping is much more accessible for financial institutions and consumers, whereas APIs are expensive to implement, Plaid’s Policy Lead John Pitts said. Plaid’s technology enables apps to connect with users’ bank accounts.

Outside of North America, governments have led the march toward API-based data access. The European Union’s PSD2 regulation offered a push toward government-mandated API-based data access. Three years ago, the U.K. government tasked the Open Banking Implementation Entity with designing open banking technology and secure APIs, making open banking a requirement. The initiative forced the U.K.’s nine largest banks to release their data in a safe, standardized form so it can be shared more easily between authorized institutions.

Taking into consideration how many resources it takes to develop APIs, think about the more than 10,000 banks and credit unions that we have in the U.S. Some of these smaller financial institutions simply don’t have the budget to implement this technology, Pitts said.

Also, there are risks when transitioning from screen scraping to APIs, including that consumers may lose access to some of their data.

“When you design an API, you actually have to specifically identify which data points the API is going to provide,” he said. “In that transition, there’s the risk that you don’t include a data point in the API.”

For example, the UK ran into this exact issue when transitioning from screen scraping to APIs, Pitts said.

“That API [in the UK], because it was designed to meet a legal requirement, doesn’t give consumers access to anything other than their payments accounts. So if you have a mortgage, if you have a savings account, if you have an investment account, the API doesn’t provide access to those accounts,” Pitts said. “The consumer actually lost access to some of their data when they transitioned from screen scraping to API, because the API didn’t give the consumer access to those accounts.”
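Pitts’ point can be sketched in a few lines: an API returns only the data its designers explicitly list, so anything left off the list drops out of reach. The account data and names below are hypothetical and purely illustrative; they are not the actual U.K. open banking specification.

```python
# Hypothetical accounts held by one consumer (illustrative data only).
CONSUMER_ACCOUNTS = [
    {"type": "payments", "name": "Everyday checking"},
    {"type": "savings", "name": "Rainy day fund"},
    {"type": "mortgage", "name": "Home loan"},
    {"type": "investment", "name": "Brokerage"},
]

# Assumption for this sketch: the API was scoped to payments accounts only.
API_EXPOSED_TYPES = {"payments"}


def visible_via_api(accounts, exposed_types):
    """Accounts a third-party app can still reach through the API."""
    return [a["name"] for a in accounts if a["type"] in exposed_types]


def lost_in_transition(accounts, exposed_types):
    """Accounts a screen scraper could see but the API does not expose."""
    return [a["name"] for a in accounts if a["type"] not in exposed_types]


print("Still available:", visible_via_api(CONSUMER_ACCOUNTS, API_EXPOSED_TYPES))
print("No longer available:", lost_in_transition(CONSUMER_ACCOUNTS, API_EXPOSED_TYPES))
```

A screen scraper sees whatever the logged-in customer sees, which is why the move to a narrowly scoped API can shrink what an app is able to show.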

So, transitioning to an API world is going to be harder than we thought. But not all hope is lost. The industry is working on it.

For instance, Plaid said it has committed to having 75% of its traffic run through APIs by the end of 2021. Meanwhile, FDX is a nonprofit whose aim is to develop a “common, interoperable and royalty-free standard for the secure access of user permissioned financial data,” according to its website. Envestnet | Yodlee has been pumping out data access agreements with major institutions like JPMorgan Chase, Wells Fargo, Charles Schwab and others.

Speaking of regulation, how are data aggregators being regulated?

The regulation of screen scraping and APIs is, well, sparse to say the least. And from our interviews, it sounds like data aggregators want to be regulated.

Costello explained that regulators have been studying this issue for a while to really understand “just how complex of a beast it is.” There’s an intersection between privacy, credit reporting and money transmittal, he explained.

Regardless of how complicated it is, Costello said, from a governance standpoint and absent regulation, Yodlee is responsible for the conduct, character and behavior of its clients.

“We have to put in place, and have put in place, an enhanced governance program, a sort of self-regulatory scheme, to make sure that our clients have willingness and the ability, the safeguards and the governance in place, to protect the consumers’ data. That takes some time. As we look at the Yodlee platform, and our clients today, it’s going to take us another two-and-a-half years to fully migrate to APIs.”

Regulators are also weighing in on consumer data rights. From a policy standpoint, Pitts said the biggest challenge in 2021 for open banking pertains to the Dodd-Frank Wall Street Reform and Consumer Protection Act Section 1033 rulemaking, which essentially states that consumers have the right to access their financial information.

Industry players recently provided feedback to the Consumer Financial Protection Bureau on parameters for consumer-permissioned access to financial data as part of a public comment process, according to an American Banker article.

But the CFPB ruling on the issue is not expected until 2022. Dan Quan, a senior adviser at McKinsey and a former senior adviser at the CFPB, told American Banker that “the aggregators want consumers to have access to their own data and banks want to control it.”

Analysts say banks amplify the risks of screen scraping because when data is funneled through aggregators, they may feel they’re being taken out of direct relationships with customers.

“Banks exaggerate [the security risks] because they don’t have control,” said David True, a partner at PayGility Advisors. “They’re becoming just dump repositories for money.”

The Incumbents’ Point of View

Banks argue that screen scraping is problematic because of security vulnerabilities, a lack of transparency around what data is being shared, unclear consumer consent parameters and data quality issues.

The Clearing House, a banking association and payments company owned by a consortium of large banks, said any data-sharing practice that allows a third-party app to log in as if it were the consumer needs to go.

“What’s hilarious is that if you go to some of the leading financial apps’ websites, they say ‘don’t give anyone else your login information into our app.’ But the only way their apps will work, is if someone gives them their bank login information,” said Ben Isaacson, senior vice president of product strategy at The Clearing House. “A party that [a consumer] doesn’t even know is getting the information to log into their bank, and that information is now being stored at a party that is unregulated.”

In addition, consumers often don’t know what permission they’re giving the aggregator to access their account information, he added.

With API-based data access, consumers are directed to the bank login page where they’re given indications of who their data is being shared with and what is being shared, based on API agreements, according to The Clearing House. And with a direct connection to the bank, the data quality will be better because it’s based on a live connection instead of a screen scraper that accesses account information only at certain times of the day, he said.

Asked why API-based connections aren’t more common, Isaacson said certain small banks struggle with the enabling technology, but on the flipside, aggregators are also dragging their feet in building the enabling infrastructure.

“What you’re telling them is ‘I have to do a technology replatform, I have to disrupt my current business with all my current customers,’ and you can make the argument that the current technology is good enough,” he said.

Although Goldman Sachs, PNC Bank and Dallas-based T-Bank either couldn’t provide a comment or couldn’t be reached to comment for this article, banks’ public comments mostly align with The Clearing House point of view.

For instance, Wells Fargo said in a blog post that screen scraping is a practice that’s “gradually waning” and that banks have formed data exchange agreements based on API technology with fintech apps and aggregators that could “eventually eliminate the problematic aspects of screen scraping.” More than 3 million Wells Fargo customers have connected their data to third-party apps using its API since the bank started entering into data exchange agreements about four years ago. Wells Fargo reached an agreement with Envestnet | Yodlee in September 2020, and with it hit the milestone of having 99% of third-party financial app screen scraping covered under agreements to transition to API-based data exchange connections.

In a position paper to the CFPB last year, PNC expressed sentiments similar to those of Envestnet | Yodlee, Plaid and other data aggregators: that screen scraping is “outdated,” that it puts sensitive customer information at risk and that it limits consumers’ informed consent about their information being shared. PNC also says that data aggregators currently aren’t subject to any comprehensive regulatory regime. How does PNC want to mitigate this issue? By pivoting to APIs supported by tokenized authentication.
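As a rough illustration of what tokenized authentication means in practice, the sketch below has the bank hand the app a random token tied to specific permissions, so the app never holds the customer’s password. The class, method and scope names are hypothetical; this is a toy in-memory model, not PNC’s or any bank’s actual implementation.

```python
import secrets


class BankTokenService:
    """Toy in-memory token issuer; hypothetical, not any bank's real API."""

    def __init__(self):
        self._grants = {}  # token -> (app name, permitted scopes)

    def grant(self, app, scopes):
        # In a real flow the consumer approves this on the bank's own login
        # page, so the third-party app never sees the bank password.
        token = secrets.token_urlsafe(16)
        self._grants[token] = (app, frozenset(scopes))
        return token

    def is_allowed(self, token, scope):
        # The token unlocks only the scopes the consumer agreed to share.
        grant = self._grants.get(token)
        return grant is not None and scope in grant[1]

    def revoke(self, token):
        # Turning access off does not require changing the bank password.
        self._grants.pop(token, None)


bank = BankTokenService()
token = bank.grant("budget-app", {"balances"})
print(bank.is_allowed(token, "balances"))      # access the consumer granted
print(bank.is_allowed(token, "transactions"))  # access the consumer did not grant
bank.revoke(token)
print(bank.is_allowed(token, "balances"))      # access after revocation
```

Compared with shared login credentials, which open up everything the customer can see, a scoped token limits the app to what was agreed and can be switched off on its own.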

Overall, only time will tell as to how the CFPB and other regulators decide to enforce rules. Until then, we’re in the wild wild west of data aggregation and open banking.

Where are we going?

One point on which banks and aggregators agree is that it’s unclear when APIs will become the dominant method for transferring data between banks and third parties. Plaid’s Policy Lead John Pitts said bluntly that he doesn’t know when APIs will reach the tipping point and become the most common data gathering method. He did note for reference that the U.K. had a strong, concerted, government-backed effort toward secure APIs and the transition still took years.

What the future holds is unclear, but Steve Smith, CEO of open banking software provider and aggregator Finicity, told FinLedger that the success of APIs and open banking hinges on several factors, including consumer control, consumers’ access to their data and whether that access is highly secure.

Open banking gives customers the option to “interact with their accounts with whatever tool and channel they like,” George Anderson, founder of Ninth Wave, a New York-based secure data connectivity provider between financial institutions and third-party applications, told FinLedger in October.

Smith thinks that down the road, we’re going to enter a new world where the consumer is the center of their data.

“[Eventually] the consumer [will have] the ability to access and control the distribution of data for their betterment,” he said. “They can decide to use it in digital underwriting, and that digital application, they can digitally verify assets, income employment, they can quickly add new accounts, they can quickly fund new accounts, they can handle payments in a variety of different ways. All of this technology and innovation [is] based around the notion of the consumer being at the center of their data and being able to access and control it.”

Meanwhile, Costello said that we need to evolve from open banking to open finance. This is a sort of financial utopia where consumers can seamlessly “turn on” and “turn off” access to their data for a variety of use cases, both financial and non-financial.

“An open finance mindset brings the customer or gives the customer the ability to permission all of their data to service providers that are well suited and responsible in using that data to help with better lifestyle choices,” Costello said.

Regulators can help get us there. According to Costello, the CFPB needs to issue some “wholesome rulemaking to level the playing field for all of the stakeholders, which includes harmonization of existing regulations.”

“We need an accreditation program for aggregators and fintechs alike,” Costello said. “On the back of that we need to build a liability framework, so that we can have a sort of a financial Hippocratic oath or a Fintech Hippocratic oath.”
