Just an aside: to analyse any such data properly, you first need a good understanding of the PDF specification, ISO 32000-1. People often look for things that PDF does not actually store, such as font style/weight, underlining as a text attribute, text flow, or headers/footers.
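For example, a bold or italic appearance usually has to be inferred, often from nothing more than the font's name. Here is a minimal sketch of that kind of heuristic in Python, assuming you have already pulled the `BaseFont` name out of the font dictionary with whatever tool you use. `guess_style` and its keyword lists are hypothetical names of mine, not part of any library, and the result is only a guess, not something the PDF format guarantees:

```python
import re

# Subset fonts are named with a tag of six uppercase letters and a "+",
# e.g. "ABCDEF+Arial-BoldMT". Strip that tag before looking at the name.
_SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def guess_style(base_font: str) -> dict:
    """Guess bold/italic from a font name (heuristic only)."""
    name = _SUBSET_TAG.sub("", base_font).lower()
    return {
        # Assumption: the font vendor put the style in the name.
        "bold": "bold" in name,
        "italic": "italic" in name or "oblique" in name,
    }


if __name__ == "__main__":
    for font in ("ABCDEF+Arial-BoldMT", "Helvetica-Oblique", "Times-Roman"):
        print(font, guess_style(font))
```

A font whose name says nothing about its style will come back as plain, which is exactly why this has to be treated as a guess.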