# Sunday, March 28, 2010

BibleTech 2010 was a blast. I got to hang out with very smart, very fun people and spend all my brain cycles thinking and pondering about the intersection of Bible and technology.

Apart from simply hanging out with fun folks like James Tauber and Mike Aubrey (to name a few), the highlight for me had to be Neil Rees’ presentation on, essentially, using an existing concordance as a model to bootstrap a new one. If you don’t care about stopwords and homographs, sure, you can just write a program. But most programs that do such tasks aren’t too good and require a lot of human post-processing, particularly if you want a smaller, non-exhaustive concordance. Rees presented on using existing, well-edited concordances (in any language) as models of the concepts to include in a new concordance of a new text. This is brilliant.
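To make the "just write a program" case concrete, here is a minimal Python sketch of the naive approach. It is not Rees’ method. The function name, the tiny English stopword list, and the sample verses are all made up for illustration. Note that it files every spelling under a single heading, so homographs get lumped together, which is exactly the kind of thing that needs human cleanup afterward.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "and", "of", "in", "a", "to", "was", "with"}


def naive_concordance(verses, stopwords=STOPWORDS):
    """Map each non-stopword to the list of references where it occurs.

    `verses` is a dict of reference -> text. Words are matched by spelling
    only, so homographs are not distinguished.
    """
    index = defaultdict(list)
    for ref, text in verses.items():
        # Each distinct word is recorded once per verse.
        for word in sorted(set(re.findall(r"[a-z']+", text.lower()))):
            if word not in stopwords:
                index[word].append(ref)
    return dict(index)


sample = {
    "Gen 1:1": "In the beginning God created the heaven and the earth.",
    "John 1:1": "In the beginning was the Word, and the Word was with God.",
}
concordance = naive_concordance(sample)
print(concordance["god"])
```

An exhaustive index like this is easy to produce. Deciding which entries belong in a smaller, non-exhaustive concordance is the hard part.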

I know some presentations are making their way to Vimeo. There are three I can recommend. First, two from James Tauber:

Also check Weston Ruter’s presentation on the Open Scriptures API:

Note that a portion of James’ 2008 BibleTech presentation dealt with the graded reader; a video describing it is below. Very cool stuff.

Post Author: rico
Sunday, March 28, 2010 7:12:50 PM (Pacific Standard Time, UTC-08:00) 

#     |  Disclaimer  |  Comments [0]
# Sunday, March 29, 2009

Here’s the paper I presented at BibleTech:2009:

Stylometry and the Septuagint: Applying Anthony Kenny’s Stylometric Study to the LXX

In 1986, Anthony Kenny wrote a book called A Stylometric Study of the New Testament which gives details for compiling and comparing book-by-book stylometric statistics for the Greek New Testament given a morphologically tagged corpus. This exploratory study proposes to apply Kenny's method to the LXX, using the Logos Bible Software LXX Morphology, to analyze style.

While Kenny's primary application of his method was in the area of authorship studies, this paper is more interested in the general style of the LXX, and not at all interested in authorship theories or assigning a 'hand' to different passages. For better or worse, this paper treats the LXX as a corpus, and has little interest in its relationship with the underlying Hebrew text.

Once the analysis has been detailed, some points of interest (known only when the analysis is complete as the nature of the study is exploratory) will be further explored. Areas in which the work could be further developed will also be reviewed.

If you actually read it, and then actually have feedback, then please let me know what you think.

In a nutshell, after looking at book-level and chapter-level distributions of part of speech, case/number/gender, and tense/voice/mood, I have a worked example of future tense in Leviticus (and then in the Pentateuch). My conclusion: In the Pentateuch, anyway, future tense verbs appear in concentrated groups. The application: when you read or work through these books, pay attention to the clustering of the future tense to determine what is going on (law-giving, prophetic stuff, whatever). And if you run across an isolated instance of the future tense, pay double attention to it, because it is not normal.
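Here is a rough Python sketch of how one might look for that clustering. This is not the code behind the paper. The input (one tense label per verb, in text order), the `max_gap` threshold, and the function names are all assumptions for illustration.

```python
from typing import List, Tuple


def future_clusters(tenses: List[str], max_gap: int = 2) -> List[Tuple[int, int, int]]:
    """Group future-tense verbs into clusters.

    `tenses` holds one tense label per verb, in text order (hypothetical input).
    Two future verbs fall in the same cluster when at most `max_gap`
    non-future verbs sit between them. Returns (start, end, count) tuples,
    where start and end are indexes into `tenses`.
    """
    clusters: List[Tuple[int, int, int]] = []
    for pos, tense in enumerate(tenses):
        if tense != "future":
            continue
        if clusters and pos - clusters[-1][1] - 1 <= max_gap:
            start, _, count = clusters[-1]
            clusters[-1] = (start, pos, count + 1)  # extend the current cluster
        else:
            clusters.append((pos, pos, 1))  # start a new cluster
    return clusters


def isolated_futures(clusters: List[Tuple[int, int, int]]) -> List[int]:
    """Return the positions of future verbs that form a cluster of one."""
    return [start for start, _end, count in clusters if count == 1]


verbs = ["aorist", "future", "future", "present", "future",
         "aorist", "aorist", "aorist", "aorist", "future", "aorist"]
clusters = future_clusters(verbs)
print(clusters)
print(isolated_futures(clusters))
```

The isolated positions are the ones worth a second look. The threshold is a judgment call and would need tuning against real text.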

At some point in the future, the audio from the talk will be on the BibleTech website.

Post Author: rico
Sunday, March 29, 2009 10:22:57 AM (Pacific Standard Time, UTC-08:00) 

# Saturday, March 28, 2009

It's Saturday, day 2 of BibleTech:2009. My paper (Stylometry and the LXX) is on at 3:00 this afternoon (Room 1 if you're here). I'll post the actual paper later (probably Sunday).

Yesterday was excellent. Intelligent people doing some pretty awesome stuff. The highlights:

The best parts, though, are the in-between times. At BibleTech, the meals are included, so you can get in conversation with folks who you run into (everyone here is doing impressive stuff, not just the people presenting) and learn more about their projects.

Gotta go before the laptop battery dies.

Post Author: rico
Saturday, March 28, 2009 7:27:46 AM (Pacific Standard Time, UTC-08:00) 

# Thursday, March 26, 2009

BibleTech:2009 starts tomorrow (Friday) AM, and I’m ready. My paper is written, I have a reading copy (yes, I’ll be reading it) and I have PowerPoint ready to go too. My presentation is Saturday afternoon from 3:00-3:45 in Room 1. The title of the paper is “Stylometry and the Septuagint: Applying Anthony Kenny’s Stylometric Study to the LXX”. I’ll post a copy of the paper to my academic papers page sometime after the conference. Check the schedule page for more info.

Some folks will be live-blogging the conference, others will be twittering to their heart’s content, I’m sure, but I won’t be doing any of that. Perhaps a post on Friday evening sometime, but maybe not even that. Or maybe a post on Sunday after the whole thing is done; we’ll see.

Looking forward to it! If you’ll be there, make sure to catch up with me during a meal. I’d love to talk more with you about whatever sorts of Bible-techie stuff you’re working on or considering!

Post Author: rico
Thursday, March 26, 2009 12:46:18 PM (Pacific Standard Time, UTC-08:00) 

# Thursday, February 05, 2009

Jim West will be aghast, but I'm going to quote Wikipedia.

Doing simple searches on "stylometry" landed me on the Wikipedia page, which has the following (and yes, Jim, the quotation is footnoted). The bold portion is the money quote:

The development of computers and their capacities for analyzing large quantities of data enhanced this type of effort by orders of magnitude. The great capacity of computers for data analysis, however, did not guarantee quality output. In the early 1960s, Rev. A. Q. Morton produced a computer analysis of the fourteen Epistles of the New Testament attributed to St. Paul, which showed that six different authors had written that body of work. A check of his method, applied to the works of James Joyce, gave the result that Ulysses was written by five separate individuals, none of whom had any part in A Portrait of the Artist as a Young Man.

All the more reason, at least for me, for being interested in stylometry as a better understanding of style, and learning more about how authors communicate. If your primary or even only interest in studying style is authorship attribution ... well ... you'll be disappointed.

Also: In many extended discussions on the authorship of the Pastorals, you'll run across Morton's name and work. Now, I'm not saying it's all bogus; there is important stuff in there about style. But discerning particular attributes of "style" (particularly through counting) does not mean one has discerned authorship. Of this, beware.

Post Author: rico
Thursday, February 05, 2009 7:52:29 AM (Pacific Standard Time, UTC-08:00) 

# Thursday, January 22, 2009

Any ideas as to what this might be?

Hint: It has to do with my BibleTech:2009 paper. (If you're using a feed reader like Bloglines, you'll need to see the post on ricoblog for the details.)

[Image not preserved.]

Any ideas?

Ok, I'll give. The above is a representation of parts of speech in the first five books of the LXX (so, the Pentateuch). Yes, lots of refining to do, but you get the gist. The order is:

  • Noun
  • Adj
  • Prn
  • Art
  • Vb
  • Cj
  • Adv
  • Ptcl
  • Intj
  • Indcl
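For anyone who wants to try a tally like this, here is a minimal Python sketch. It is not the code that produced the representation in this post. The `(book, part of speech)` input pairs, the function name, and the sample data are assumptions for illustration. Only the tag order comes from the list above.

```python
from collections import Counter

# Part-of-speech order as listed in the post.
POS_ORDER = ["Noun", "Adj", "Prn", "Art", "Vb", "Cj", "Adv", "Ptcl", "Intj", "Indcl"]


def pos_profile(tokens):
    """Return {book: [share of each part of speech, in POS_ORDER]}.

    `tokens` is an iterable of (book, pos) pairs, one per word
    (a hypothetical input shape, not a real Logos data format).
    """
    counts = {}
    for book, pos in tokens:
        counts.setdefault(book, Counter())[pos] += 1
    profile = {}
    for book, tally in counts.items():
        total = sum(tally.values())
        profile[book] = [tally[pos] / total for pos in POS_ORDER]
    return profile


sample = [
    ("Genesis", "Noun"), ("Genesis", "Vb"), ("Genesis", "Noun"), ("Genesis", "Art"),
    ("Exodus", "Vb"), ("Exodus", "Cj"),
]
profile = pos_profile(sample)
print(profile["Genesis"])
```

Each book ends up as one row of ten proportions, which could then be drawn as shaded cells or bars.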

 

Post Author: rico
Thursday, January 22, 2009 8:08:07 AM (Pacific Standard Time, UTC-08:00) 

# Tuesday, January 20, 2009
Click here to learn more about BibleTech:2009!

(particularly if you're in the Pacific Northwest / British Columbia region)

Because cool people are doing cool things. Mike Aubrey, a very smart and very motivated guy, is playing around with automated morphological tagging using some of SIL's existing tools. And he's giving a paper on it. (Go here for more info, then go here to register for BibleTech:2009)

If this sort of thing floats your boat, not only will you be able to hear the paper—you'll be able to sit down at a meal with Mike and talk with him more about it.

And that's what I like about BibleTech. Sure, there is learning new stuff from folks doing cool things. But there is also a sense of community where you can actually talk further (outside of a formal Q&A session) about stuff and get to know someone.

So consider attending, and please do register!

Post Author: rico
Tuesday, January 20, 2009 11:41:05 AM (Pacific Standard Time, UTC-08:00) 

# Monday, January 19, 2009

Sound interesting? Then you should come to BibleTech:2009, which is to be held in Seattle on March 27 and 28. Logos just pushed a press release with more info.

If you're in the Seattle area or the Northwest, then you should register for BibleTech:2009, come on down and hang out with us. Note that registration includes sessions and catered meals. The meals were one of the best parts of last year's conference. Too often at conferences there is too much hustle-and-bustle and not enough time actually interacting with the interesting and smart folks there. The meal times allow for that, and it's pretty cool.

Here's the text of the press release, for more information.

BELLINGHAM, WA–January 2, 2009–Scholars, publishers and technologists will be in attendance at the second-annual BibleTech conference in Seattle, WA on March 27 and 28.

BibleTech:2009 will feature more than twenty-five presentations from leading publishers, software developers, and web developers. Topics include data standards, the semantic web, mobile computing, ancient languages, and integrating technology into the Bible classroom.

“BibleTech is a place for everyone interested in the Bible and technology. There is no other conference where publishers, academics, ministry leaders, and technologists can find so much common ground,” said Bob Pritchett, President of Logos Bible Software.

BibleTech:2009 will feature two tracks. The first will address the technical aspects of programming, designing, and publishing software for Bible study and ministerial applications. The second track will focus on the application and implementation of Bible-based technologies including sermon preparation and advanced computer-based research strategies.

Featured presenters include: Mark Stephenson, Director of Web-Empowered Church; Lance Ford, Co-founder of Shapevine.com and WebChurchMedia; and Ellen Frankel, CEO and Editor-in-Chief of the Jewish Publication Society. A complete list of 2009 conference speakers is available at www.bibletechconference.com/speakers.htm.

More information is available at www.bibletechconference.com

What are you waiting for? Sign up, and come see us!

Post Author: rico
Monday, January 19, 2009 10:03:54 AM (Pacific Standard Time, UTC-08:00) 

# Monday, January 12, 2009

I've mentioned the upcoming Bible Technologies Conference and the paper I plan on presenting there (also info here). I've recently realized that I've got a little more than two months to get the durn thing written.

I also realized that Kenny spent 124 pages talking about Stylometry in the New Testament; I'm giving a paper that is allotted perhaps 30 minutes (some portion of which is intended for questions) for a corpus that is roughly four times the size of the New Testament.

In other words, I'm realizing that I'll have to give a very high level overview with perhaps some glimpses at deeper-level data. Chances are I'll follow most of Kenny's lead, which means:

  • Rough overview of distribution of major parts of speech (nouns, adjectives, verbs, adverbs, conjunctions, prepositions, etc.)
  • Rough overview of most common words and their distribution/frequency
  • Perhaps some further look at things like conjunctions and articles
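The "most common words" overview is the easiest of these to sketch. Here is a minimal Python version, assuming the corpus has already been reduced to a flat list of lemmas. The function name and the toy lemma list are made up for illustration.

```python
from collections import Counter


def top_lemmas(lemmas, n=3):
    """Return the n most frequent lemmas as (lemma, count, relative frequency)."""
    counts = Counter(lemmas)
    total = sum(counts.values())
    return [(lemma, count, count / total) for lemma, count in counts.most_common(n)]


# Toy data, not real LXX counts.
lemmas = ["καί", "ὁ", "καί", "λέγω", "ὁ", "καί"]
for lemma, count, share in top_lemmas(lemmas, n=2):
    print(lemma, count, round(share, 3))
```

Running the same function per book and comparing the relative frequencies gives the distribution side of the overview.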

Kenny then used portions of his data in the evaluation of certain textual issues, mostly geared toward authorship (Luke/Acts, John/epistles/Revelation, Paulines). I'll have to determine an issue to examine further using the data pulled together, but I have some constraints:

  • No examination of JEDP, whatsoever.
  • No examination of authorship, whatsoever.
  • No examination of translational theory, whatsoever.

Given these constraints, are there stylistic issues in the LXX that you would suggest I use for my example case study?

My own thoughts have to do with genre (say, look at stuff having to do with narrative versus stuff having to do with poetry to see if there are any sorts of things that seem to be indicative of one or the other). But I'm interested in what you might think or suggest. For an idea of the criteria/features I'm tracking, see this post.

Please feel free to leave a comment with your suggestion(s), or drop me an email (textgeek at gmail dot com). Thanks!

Post Author: rico
Monday, January 12, 2009 4:35:53 PM (Pacific Standard Time, UTC-08:00) 

# Wednesday, November 19, 2008

[NB: I'll be blogging random things about my upcoming BibleTech:2009 paper; these posts will all be available in the "bibletech" category. If you're presenting a paper at the conference, might I suggest the same practice? That way they'll all be available by a search for 'bibletech' on Technorati or some other such service. —RB]

In his book A Stylometric Study of the New Testament (amazon.com), Anthony Kenny lists 99 features that he tracked across the corpus, using them as a guide to his analysis. His feature list is based on categorization of the Friberg morphology circa 1986. I believe the Friberg has undergone significant revision since then and is considered to be in at least its second edition; perhaps even the third edition. Kenny also includes some stock lexical items such as conjunction instances, preposition instances, and some specific words (e.g. θεος, λεγω). Note that Kenny did all of his counts by hand, from "the microfiche concordance of the machine-readable version of the Analytical Greek New Testament"! He used a TI 58 statistical calculator for his numbers, and also "the ICL 2988 machine in the Oxford University Computer Services" (Kenny, "Note on Sources").

Right now, I'm thankful for fast computers, XML and for Perl and/or C# (haven't figured out which language I'll use for the code yet).

In my paper for BibleTech:2009, I'm proposing to carry out a similar analysis, only of the LXX, using the Logos Morphology. There are several of the 99 categories that can be re-used (81, to be exact). Friberg has much more going on in adjectives, adverbs, and conjunctions than the Logos LXX Morphology; this accounts for much of the difference.

However, I think I'll be able to track up to 106 features, and perhaps more. How? Kenny did very little with participles, and even less with pronouns. I have no idea why he did little with participles because the Friberg morphology is rich in this area (even differentiating, at the 'mood' slot, between 'participle' and 'participle (imperative sense)', Kenny pp. 10-11, figure 2.1). It may have been because it would be too tedious (recall he counted by hand). But pronouns are simply a type of noun in this edition of Friberg, so Kenny's hands were tied (he tracked third-person pronouns in sum and also by case, but that's it).

Kenny also didn't track instances of the vocative case (for articles, nouns, and adjectives). But he did track optatives and pluperfects (indeed rare cases in the NT). Thus, to the 81 shared criteria, I'm considering adding 25 more for a total of 106.

If you're interested, the list of 25 additional features is below.

Because of differences in classification of voice
82. Number of occurrences of third-person singular indicative verbs in the either-middle-or-passive voice

Participles
83. Number of occurrences of verbs in the participle mood
84. Number of occurrences of participles in the nominative
85. Number of occurrences of participles in the dative
86. Number of occurrences of participles in the genitive
87. Number of occurrences of participles in the accusative
88. Number of occurrences of participles in the masculine
89. Number of occurrences of participles in the feminine
90. Number of occurrences of participles in the neuter
91. Number of occurrences of participles in the singular
92. Number of occurrences of participles in the plural

Proper Nouns
93. Number of occurrences of Proper Nouns

Interjections
94. Number of occurrences of Interjections (I)

Vocatives
95. Number of occurrences of vocative articles
96. Number of occurrences of vocative nouns
97. Number of occurrences of vocative adjectives

Other Pronoun Information
98. Number of occurrences of Relative Pronouns
99. Number of occurrences of Reciprocal Pronouns
100. Number of occurrences of Demonstrative Pronouns
101. Number of occurrences of Correlative Pronouns
102. Number of occurrences of Interrogative Pronouns
103. Number of occurrences of Indefinite Pronouns
104. Number of occurrences of Reflexive Pronouns
105. Number of occurrences of Possessive Pronouns
106. Number of occurrences of Personal Pronouns

There may be more, I just have to think about it a bit more. For instance, I could add case-specific instances of each pronoun type (so, relative pronouns in the nominative, in the genitive, etc.) but at present I'm thinking that's overkill. Of course, I may change my mind. I also need to consider if there are particular word instances to include in the feature list; I may have to do some word frequency analysis in order to determine candidates. I will also have to review LXX-specific conjunctions and prepositions to determine how those portions of the list might be expanded.
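To give a feel for how a feature list like this turns into code, here is a minimal Python sketch covering a handful of the features above. Everything about the data shape is an assumption. The dictionary keys (`pos`, `mood`, `case`, `subtype`, `proper`) and their values are hypothetical and do not reflect the actual Logos LXX Morphology encoding.

```python
# Each feature number maps to a test applied to one tagged token.
# Token keys and values are hypothetical, chosen only for this sketch.
FEATURES = {
    83: lambda t: t.get("mood") == "participle",
    84: lambda t: t.get("mood") == "participle" and t.get("case") == "nominative",
    93: lambda t: t.get("pos") == "noun" and t.get("proper", False),
    96: lambda t: t.get("pos") == "noun" and t.get("case") == "vocative",
    98: lambda t: t.get("pos") == "pronoun" and t.get("subtype") == "relative",
}


def count_features(tokens, features=FEATURES):
    """Count how many tokens satisfy each feature test."""
    totals = {number: 0 for number in features}
    for token in tokens:
        for number, matches in features.items():
            if matches(token):
                totals[number] += 1
    return totals


sample = [
    {"pos": "verb", "mood": "participle", "case": "nominative"},
    {"pos": "verb", "mood": "participle", "case": "genitive"},
    {"pos": "noun", "case": "vocative"},
    {"pos": "noun", "case": "nominative", "proper": True},
    {"pos": "pronoun", "subtype": "relative", "case": "accusative"},
]
totals = count_features(sample)
print(totals)
```

Adding a feature is then just one more entry in the table, which makes it cheap to try the case-specific pronoun counts and see whether they really are overkill.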

Post Author: rico
Wednesday, November 19, 2008 3:15:49 PM (Pacific Standard Time, UTC-08:00) 

# Wednesday, October 29, 2008

Here's what I've submitted for BibleTech:2009. We'll see whether or not the paper is accepted.

Note that the deadline for submissions is Nov 3, 2008. Last year was a blast, so I'm pumped for this year's conference (on March 27-28, 2009 in Seattle, WA).

Stylometry and the Septuagint: Applying Anthony Kenny's Stylometric Study of the NT to the LXX

In 1986, Anthony Kenny wrote a book called "A Stylometric Study of the New Testament" which gives details for compiling and comparing book-by-book stylometric statistics for the Greek New Testament given a morphologically tagged corpus. This exploratory study proposes to apply Kenny's method to the LXX, using the Logos Bible Software LXX Morphology, to analyze style.

While Kenny's primary application of his method was in the area of authorship studies, this paper is more interested in the general style of the LXX, and not at all interested in authorship theories or assigning a 'hand' to different passages. For better or worse, this paper treats the LXX as a corpus, and has little interest in its relationship with the underlying Hebrew text.

Once the analysis has been detailed, some points of interest (known only when the analysis is complete as the nature of the study is exploratory) will be further explored. Areas in which the work could be further developed will also be reviewed.

I should stress again, the key word is exploratory, particularly since I'll be using a beta/in-development form of the Logos LXX Morphology. I don't have theories I'm trying to prove; I'm interested in seeing what sorts of information come to light when applying a Kenny-esque technique to the analysis of the style of the LXX.

Will you be at BibleTech:2009? You really should be; last year's was one of the best, most fun conferences I've ever been to. And, if your paper proposal is accepted, you don't have to pay the conference registration! How cool is that?

Post Author: rico
Wednesday, October 29, 2008 9:30:43 AM (Pacific Standard Time, UTC-08:00) 
