A lot of work went into the Platform this weekend, thanks in part to my new contributor Jared Williams. While I wound up writing the majority of the code, conversations with him greatly helped clarify the goals and structure of the platform.
One of the most important things to come out of this was moving the flags out of the document and into a separate "schema" structure. The schema dictates how the input document's data structure and values are interpreted, via attribute flags on its nodes. These flags are then applied to the corresponding nodes in the input documents to produce SOLR output documents and their corresponding XML serializations.
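To make the idea concrete, here is a minimal Python sketch of applying schema flags to an input document. This is not the platform's actual code: the flag names (`field`, `multi`, `skip`), the dict-based schema layout, and the helper names `apply_schema` and `to_solr_xml` are all assumptions made up for illustration. The only part taken from Solr itself is the `<add><doc><field name="...">` shape of its XML update messages.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: one entry per input node, holding its attribute flags.
# The flag names here are invented for this sketch.
SCHEMA = {
    "id": {"field": "id"},
    "title": {"field": "title_t"},
    "tags": {"field": "tag_ss", "multi": True},
    "internal_notes": {"skip": True},
}


def apply_schema(schema, document):
    """Return (field_name, value) pairs for one output document."""
    fields = []
    for key, value in document.items():
        flags = schema.get(key)
        if flags is None or flags.get("skip"):
            continue  # nodes with no schema entry, or flagged skip, are dropped
        name = flags.get("field", key)
        if flags.get("multi"):
            # A multi-valued node emits one output field per value.
            values = value if isinstance(value, list) else [value]
            fields.extend((name, str(v)) for v in values)
        else:
            fields.append((name, str(value)))
    return fields


def to_solr_xml(fields):
    """Serialize the pairs as a Solr-style <add><doc> update message."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields:
        ET.SubElement(doc, "field", name=name).text = value
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    sample = {"id": 7, "title": "Hello", "tags": ["a", "b"], "internal_notes": "x"}
    print(to_solr_xml(apply_schema(SCHEMA, sample)))
```

Keeping the flags in the schema rather than in the document means the same input can be re-interpreted just by swapping schemas, which is the main appeal of the split.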
In the next few days we'll be generating a large sample input set, corresponding schemas, and a front-end API to deliver results. This will allow us to put the platform to the test and give me something to push up as a demo on the EC2 server I've set up.
In the meantime, example documents have been included below:
Example Schema:
Example Document:
Example Output: