Extracting Semi-Structured Data from PDFs on a Large Scale (github.com/janedoesrepo) | 1 point | captndata | 8 years ago