Dealing with dodgy markup

Lately I've been scraping various parliamentary websites to collect all the data I need for the YVIH API. I might write up how I've gone about this in more detail in another post, as there's plenty to discuss. However, today's post is about dealing with dodgy markup.…