Website Classification

An automated website classification system was developed for a client that needed to decide, without manual review, whether specific websites were appropriate places to advertise.

The analysis first required a general way to detect template-based websites, so a meta-language was developed for high-level pattern matching on website structure.
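
The meta-language itself is not described here, so the following is only a minimal sketch of what structural pattern matching on a page could look like. It assumes a pattern is a list of glob expressions over element paths such as html/body/div/h1; the names PathCollector, element_paths, matches_template, and BLOG_TEMPLATE, as well as the pattern syntax, are hypothetical and not taken from the original system.

```python
from fnmatch import fnmatchcase
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class PathCollector(HTMLParser):
    """Records the root-to-element tag path of every element in a page."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = []

    def handle_starttag(self, tag, attrs):
        self.paths.append("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Tolerate sloppy markup: close everything up to the matching open tag.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def element_paths(html):
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return collector.paths


def matches_template(html, pattern):
    """True if every glob in the pattern matches at least one element path."""
    paths = element_paths(html)
    return all(any(fnmatchcase(path, glob) for path in paths) for glob in pattern)


# Hypothetical template: a heading inside a div, plus a link list anywhere under body.
BLOG_TEMPLATE = ["html/body/div/h1", "html/body/*/ul/li/a"]

page = ("<html><body><div><h1>Title</h1></div>"
        "<div><ul><li><a href='#'>one</a></li><li><a href='#'>two</a></li></ul></div>"
        "</body></html>")
other = "<html><body><p>Hello<br>world</p></body></html>"

print(matches_template(page, BLOG_TEMPLATE))
print(matches_template(other, BLOG_TEMPLATE))
```

Matching on paths rather than on raw markup keeps a pattern independent of text content and attribute values, which is one plausible way to recognize pages generated from the same template.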

Classification was then done using latent semantic analysis on specific parsed sections of matching websites.
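
As a rough illustration of the classification step, the sketch below applies latent semantic analysis to a handful of short texts using only the standard library. The training texts, the labels, the choice of two latent dimensions, and the nearest-neighbor decision rule are all assumptions made for the example; the source does not say how the original system weighted terms or turned the latent representation into a decision.

```python
import math
from collections import Counter

# Hypothetical labeled text, standing in for parsed sections of matched sites.
TRAINING = [
    ("casino poker bonus", "unsuitable"),
    ("casino poker jackpot bonus", "unsuitable"),
    ("poker casino slots", "unsuitable"),
    ("garden tomato soil", "suitable"),
    ("garden soil compost planting", "suitable"),
]


def tokenize(text):
    return text.lower().split()


def term_document_matrix(docs):
    """Return (term -> row index, count matrix with one row per term)."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    index = {tok: i for i, tok in enumerate(vocab)}
    matrix = [[0.0] * len(docs) for _ in vocab]
    for col, doc in enumerate(docs):
        for tok, count in Counter(tokenize(doc)).items():
            matrix[index[tok]][col] = float(count)
    return index, matrix


def top_eigenpairs(sym, k, iters=200):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        vec = [float(i + 1) for i in range(n)]  # deterministic, non-uniform start
        for _step in range(iters):
            nxt = [sum(work[r][c] * vec[c] for c in range(n)) for r in range(n)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0.0:
                break
            vec = [x / norm for x in nxt]
        value = sum(vec[r] * work[r][c] * vec[c] for r in range(n) for c in range(n))
        pairs.append((value, vec))
        for r in range(n):  # deflate: remove the component just found
            for c in range(n):
                work[r][c] -= value * vec[r] * vec[c]
    return pairs


def fit_lsa(docs, k=2):
    """Truncated SVD of the term-document matrix via its document Gram matrix."""
    index, a = term_document_matrix(docs)
    n = len(docs)
    gram = [[sum(row[i] * row[j] for row in a) for j in range(n)] for i in range(n)]
    basis = []
    for value, vec in top_eigenpairs(gram, k):
        if value <= 1e-12:
            continue
        sigma = math.sqrt(value)
        # Left singular vector: one weight per term for this latent dimension.
        basis.append([sum(row[j] * vec[j] for j in range(n)) / sigma for row in a])
    return index, basis


def project(text, index, basis):
    """Map a text into the latent space, ignoring terms unseen during fitting."""
    counts = Counter(tok for tok in tokenize(text) if tok in index)
    return [sum(axis[index[tok]] * c for tok, c in counts.items()) for axis in basis]


def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


def classify(text, index, basis, labeled):
    """Label of the training text closest to the query in latent space."""
    query = project(text, index, basis)
    return max(labeled, key=lambda item: cosine(query, item[0]))[1]


docs = [text for text, _ in TRAINING]
index, basis = fit_lsa(docs, k=2)
labeled = [(project(text, index, basis), label) for text, label in TRAINING]

print(classify("jackpot poker slots", index, basis, labeled))
print(classify("tomato compost garden", index, basis, labeled))
```

Because the comparison happens in the reduced latent space rather than on raw word counts, a query can match a training text with which it shares only part of its vocabulary. A production system would more likely rely on a numerical library for the decomposition and on a larger labeled corpus.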