When Adobe introduced the Portable Document Format (PDF) in 1993, a Gartner consultant called it “the dumbest idea I’ve ever heard.” At the time, users had to wait for large files to download over dial-up Internet before they could open them. The company’s board even wanted to shut down the project. But PDF became widespread, especially after the U.S. Internal Revenue Service (IRS) began using it for digital tax forms. Today, there are more than 2.5 trillion PDFs in the world.
However, PDF’s shortcomings remain. It is difficult to read on a smartphone, extracting data from it is cumbersome, and screen readers for the visually impaired struggle with it. Since Adobe relinquished control in 2008, the format has also become a vehicle for malware: according to Check Point, one in five email-based cyberattacks is carried out via PDF attachments.
Recently, another problem has emerged. Large language models (LLMs), which underpin generative AI, have difficulty “reading” PDFs correctly. They can, for example, parse multi-column pages in the wrong order, which sometimes leads chatbots to “fabricate” incorrect information.
Some startups, such as Factify, aim to create a new file format that could replace PDF. Duff Johnson, head of the PDF Association, counters that the problem is not the format itself but the way it is used. Adobe and Google are already improving their AI tools to work better with PDFs. It seems that PDF’s reign may continue.
