
Thanks for this suggestion! We'll definitely kick Blacklight around as an option. We have a complex data model (cases, opinions, parties, citations, courts, jurisdictions, volumes ...), and tens of millions of pages, and weird access restrictions, and multiple output formats (text/html/xml/pdf), so whatever we end up with will have to be pretty custom.



I worked for the British Medical Journal a while ago. They had 120 years of articles digitised, along with several different medical databases, and around a dozen different applications using this content in various ways. It was basically all stored internally as XML using the Documentum/xDB CMS/XML database stack, with lots of XSLT used to generate content for different apps and in different formats.

It's not at all the easiest system to work with, but it does let you handle a lot of structured data with the minimum amount of customisation possible - which is still a lot!
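
To make the "one XML source, many outputs" idea concrete, here is a minimal sketch in Python. It is not the BMJ setup: plain Python functions stand in for the XSLT stylesheets, the standard library's ElementTree stands in for the XML database, and the element names (`article`, `title`, `year`, `para`) are invented for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical article record; the element names are assumptions, not a real schema.
SOURCE = """<article id="a1">
  <title>On Scurvy &amp; Citrus</title>
  <year>1901</year>
  <para>Lemon juice prevents scurvy.</para>
  <para>Dosage is discussed below.</para>
</article>"""


def to_text(article):
    """Render the record as plain text: a title line, a blank line, then paragraphs."""
    lines = [f"{article.findtext('title')} ({article.findtext('year')})", ""]
    lines += [p.text or "" for p in article.findall("para")]
    return "\n".join(lines)


def to_html(article):
    """Render the same record as an HTML fragment, escaping all text content."""
    title = escape(article.findtext("title"))
    paras = "".join(f"<p>{escape(p.text or '')}</p>" for p in article.findall("para"))
    return f'<article id="{escape(article.get("id"))}"><h1>{title}</h1>{paras}</article>'


# One renderer per output format; each plays the role a stylesheet would play.
RENDERERS = {"text": to_text, "html": to_html}


def render(xml_source, fmt):
    """Parse the single XML source and hand it to the renderer for the requested format."""
    return RENDERERS[fmt](ET.fromstring(xml_source))


if __name__ == "__main__":
    print(render(SOURCE, "text"))
    print(render(SOURCE, "html"))
```

The point of the design is that a new output format means adding one more renderer, while the stored XML stays untouched. In a real XSLT-based stack each renderer would be a stylesheet rather than a function.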



