Wednesday, July 06, 2005

Mark Goodacre (NT Gateway Weblog) responds to a question on deinde from Danny Zacharias regarding Scripture indexing of books.

Danny, if you're using Word (or some other word processor) to edit the sorted data exported from Excel, you can try your hand at MSWord's "wildcard" matching to turn "Ge{tab}1{tab}1-3" into "Ge 1:1-3". You can use metacharacters like '^t' to match invisible stuff like tabs, and replace everything at once instead of tediously hand-hacking each line. I just played around with this and had forgotten how much I dislike Word's "Wildcard" or "Pattern Matching" capability. Anyway, if you search the help for "wildcard" you'll find some scant documentation. Assume input like:

Ge{tab}1{tab}1-3
Ge{tab}2{tab}3

Here {tab} stands for an actual tab character. From that input, you can get text like:

Ge 1:1-3
Ge 2:3

With "wildcards" like this and the "Use Wildcards" box checked:

Find What: (<[a-zA-Z0-9 ]@>)^t([0-9]@)^t([0-9-]@)^13
Replace With: \1 \2:\3^p

This assumes that the first field contains only letters, numbers and spaces, that the second field only ever contains numbers, and that the third field is only ever numbers and the '-' character. You may need to modify the pattern if your data has other requirements.
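If the export never needs to pass through Word at all, the same substitution can be sketched in a few lines of Python. This is only a rough equivalent under the same assumptions about the three fields; the function name join_reference_fields is made up for illustration and is not part of any tool mentioned here.

```python
import re

# Same assumptions as the Word pattern: a book field of letters, digits and
# spaces, a numeric chapter field, and a verse field of digits and '-',
# separated by tabs, one reference per line.
LINE = re.compile(r"^([A-Za-z0-9 ]+)\t([0-9]+)\t([0-9-]+)$", re.MULTILINE)


def join_reference_fields(text: str) -> str:
    """Rewrite 'Ge<TAB>1<TAB>1-3' as 'Ge 1:1-3'; other lines pass through unchanged."""
    return LINE.sub(r"\1 \2:\3", text)


sample = "Ge\t1\t1-3\nGe\t2\t3\n"
print(join_reference_fields(sample), end="")
```

Lines that do not fit the three-field shape are left untouched, so it is easy to spot the ones that still need hand-hacking.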

With all of that said, time for the tangent/self-promotion:

Over on my single-topic blog PastoralEpistles.com, I just wrote some code that evaluates posts for cited references (hyperlink text to an online edition of the ESV at ESV.org) and generates a sorted reference index. Reference indexes are handy things. Being able to jump into blog posts (and other things like bibliography entries) based on a Scripture reference can, at times, be even handier.

On the post entry side of things, I've made it very easy to "tag" these sorts of references (e.g. {esv|1Ti 3.1-7} is rendered as the linked reference 1Ti 3.1-7). The indexing code searches through posts, looks for the particular sorts of tags that indicate a tagged reference, and compiles a list. There's more to it: one has to account for alternate forms of canonical book names in some manner. Once the list is generated, it is sorted according to a sort key (a numeric string generated from the reference itself for sorting purposes) and saved as an XML file on the server. When the index is displayed, the XML is converted into HTML and dumped to the screen in the site template.
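For the curious, the general shape of that process can be sketched in Python. This is not the code running on the site, just a minimal illustration of the idea: the book table is a tiny made-up subset, the key layout (two digits for the book, three each for chapter and verse) is an assumption, and the names sort_key and build_index are hypothetical. The XML and HTML steps are left out.

```python
import re

# Tag syntax as described in the post: {esv|1Ti 3.1-7}
TAG = re.compile(r"\{esv\|([^}]+)\}")
# Book name, chapter, and an optional starting verse after '.' or ':'.
REF = re.compile(r"^\s*(.+?)\s+(\d+)(?:[.:](\d+))?")

# Illustrative subset only: lowercased alias -> (abbreviation, canonical order).
BOOKS = {
    "1ti": ("1Ti", 54), "1 tim": ("1Ti", 54), "1 timothy": ("1Ti", 54),
    "2ti": ("2Ti", 55), "2 tim": ("2Ti", 55), "2 timothy": ("2Ti", 55),
    "tit": ("Tit", 56), "titus": ("Tit", 56),
}


def sort_key(reference):
    """Return a numeric string such as '54003001', or None if the book is unknown."""
    match = REF.match(reference)
    if match is None:
        return None
    book = BOOKS.get(match.group(1).lower())
    if book is None:
        return None
    chapter = int(match.group(2))
    verse = int(match.group(3) or 0)
    return f"{book[1]:02d}{chapter:03d}{verse:03d}"


def build_index(posts):
    """posts maps a post id to its raw text; returns sorted (key, reference, post id) rows."""
    rows = set()  # a set, so the same reference in the same post is listed once
    for post_id, text in posts.items():
        for reference in TAG.findall(text):
            key = sort_key(reference)
            if key is not None:
                rows.add((key, reference.strip(), post_id))
    return sorted(rows)


posts = {
    "post-a": "See {esv|Tit 1.5} and {esv|1Ti 3.1-7}.",
    "post-b": "Compare {esv|2 Timothy 2.2} with {esv|1Ti 3.1-7}.",
}
for key, reference, post_id in build_index(posts):
    print(key, reference, post_id)
```

Because the key is a fixed-width string, a plain string sort puts the references in canonical order, which is the main reason to generate a key rather than sort on the reference text itself.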

You can see it on the Bible Index page at PastoralEpistles.com. I see I have a small problem with the entry for 1Ti 3.1-7 duplicated; I'll have to look into that. Not quite sure what would cause that ...

 

Post Author: Rico
Wednesday, July 06, 2005 4:54:09 PM (Pacific Daylight Time, UTC-07:00) 
