Luke 15 Basic Errors

 

In reviewing the Luke 15 study in its development of a complete p-value analysis for Theomatics, we noted six errors made by the author that significantly impact the final conclusion in this analysis.
 

Redundant Hit Phrase

The author claims 46 unique and different hits: claim (1) on p.27. However, hit 46 is redundant, using the same exact phrase (SOU OUTOS) as hit 41.

In this connection, the author states (emphases his), "Relative to the 'miscount' on #41 and #46, I am trying to recollect why I did that (it has been some time). Even though the words are redundant, in this particular instance 'sou autos' is USED IN TWO COMPLETELY DIFFERENT CONTEXTS.  One (verse 30) is in reference to 'this thy SON,'  and the other to 'this thy BROTHER.'  So both are not redundant in that one points SPECIFICALLY to a son and the other SPECIFICALLY to a brother. I don't believe there is any other instance in Luke 15 where redundant words are in two completely different contexts. So in that regard it would be VALID to present the same Greek words twice, each as an ORIGINAL hit.  This I would refer to as a 'line call.'"

Clearly, referring to this as a "line call" implies that there is something subjective in the concept of the author's requirement that all phrases be "unique and different." One would naturally interpret such a standard to mean that the same phrase may not be counted more than once, regardless of the particular context of the phrase. The author supports this interpretation in T&SM, p. 4-6: (emphasis his) " Each original phrase can only be counted one time. Take for example the word 'THE-SON' (o uioj), which has a  value of 150 X 5. It appears in many passages, but can only be counted one time, from one passage. Obviously, it would be absurd from a statistical standpoint to take 10 passages that repeat the same spelling for the word 'SON,' and say that we had found 10 significant features. Each passage must exhibit a unique and different pattern."

The tenor of these remarks indicates that multiple counts of a phrase should be excluded regardless of whether the phrase appears to have slightly different meanings in different instances, such as whether a reference to "son" refers to the same person in each instance. The idea is to prevent "stacking" of the statistical results by not allowing identical phrases to produce multiple hits. The spirit of the principle apparently has nothing whatsoever to do with the meaning of the phrase or to what it refers in the context. The author respects this interpretation in that he does not count multiple hits of the word AUTOU, which is divisible by 90, even though it refers to different persons in different locations: in verse 13 it refers to the prodigal son, and in verse 28 it refers to the elder brother. The author (correctly) only includes the hit from verse 13.

In our opinion, the elaboration by the author, as well as his practice (with AUTOU), is sufficiently clear: the author was being purposefully arbitrary and inconsistently subjective in this instance by counting SOU OUTOS as a multiple hit. By his own definition, by his example, and in the context of the "conservative" spirit of this analysis, there are only 45 unique and different hits in his published results. To forestall the inevitable criticism, the author should have made this "line call" the other way... playing "safe." At the very least, he should have stated clearly in the analysis that he was doing otherwise.


Redundant Hit Sums

The author made a second error in violating his stated methodology by including phrases in his sample having equivalent phrase sums. We have located 77 such phrases, and presume that most (if not all) of them were retained by the author in his phrase pool. Four of these redundant phrases were hits, the second reference in each pair of matching sums below.

No    Reference     Sum   W   Phrase
180   LUKE 15:17    808   1   EGW
      LUKE 15:25    808   2   HN O UIOJ
207   LUKE 15:17   2250   4   EAUTON ELQWN EIPE POSOI
      LUKE 15:28   2250   2   OUN O PATHR AUTOU
385   LUKE 15:24   1260   2   O UIOJ MOU
      LUKE 15:27   1260   1   OTI O ADELFOJ
399   LUKE 15:24    810   4   NEKROJ HN KAI ANEZHSE
      LUKE 15:27    810   1   ADELFOJ

All of these redundant phrases are short, reducing the author's WLA inappropriately.


Phrase Rule Violation

A third error is another indiscretion in judgment that violates the spirit of his "conservative analysis," where the author subjectively deviates from established Theomatic principles in order to increase the number of successes in the experiment. The author's 37th hit is counted as a direct hit of one word even though it ends in two conjunctions. It is actually a 3-word phrase, but the author does not count de kai as base words since they are conjunctions. Here he violates his own phrase construction rules listed on p. 4-42, T&SM, which indicate that only beginning conjunctions should not be counted as base words.

Concerning this the author states (p.26): "Normally, Theomatics never considers phrases that end with a conjunction. Yet some instances do carry  a clear significance. The above is not a typical case of one long phrase connecting to another long phrase, but simply carries over the emphasis of the brother being angry and refusing to enter the celebration."

It is unclear exactly what the author means by this remark, or why the apparent connection of the latter phrase to the first implies "clear significance" such that the first phrase should be allowed to end in a conjunction and violate general Theomatic procedure. It raises the question as to why the author ever even considered this phrase in his sample, since it clearly deviates from Theomatic practice and since he claims that all phrases were initially constructed with absolutely no prior knowledge of whether they would produce successes or not. Why would he just "happen" to want to violate a normal rule in Theomatics to include such an awkward phrase, and presume the above explanation for so doing, unless he did so knowing that it was a success? Evidently, after years of experience in this practice, he is very good at seeing these patterns with his eyes, which he does freely admit. It is clearly an awkward position for him to maintain in this "most conservative" analysis of Theomatic design.

Again, the nature of this experiment being as it is, to demonstrate clear Theomatic significance "in the most conservative possible manner," it is imperative that such "discretionary calls" be made "conservatively." We therefore consider the inclusion of this hit an error. Therefore, of the author's claimed 46 hits, only 44 are valid.


Hit Omissions

In addition to including two hits that he should not have included, a fourth error made by the author is his failure to locate 13 other Theomatic hits in the story. The following phrases can be constructed in a manner consistent with all of the other phrases the author has constructed. They each contain four primary words or fewer (discounting variables and beginning conjunctions), they all contain a reference to one of the sons as noted by the author (bounded by >angled brackets<), and all are taken directly from the text with words in exact juxtaposition. In the table below, the reference is given first, followed by the phrase sum (S), the other factor besides 90 (F), the cluster radius (R), the phrase length in primary words (L), and the phrase itself.

 

No   Reference       S    F    R   L   Phrase
1    LUKE 15:14   2701   30    1   4   XWRAN EKEINHN KAI>AUTOJ
2    LUKE 15:14   3059   34   -1   4   THN XWRAN EKEINHN KAI>AUTOJ
3    LUKE 15:16   2608   29   -2   3   AUTOU<APO KERATIWN
4    LUKE 15:17   2068   23   -2   3   ARTWN DE>EGW
5    LUKE 15:20   3152   35    2   4   AUTOU KAI KATEFILHSEN>AUTON
6    LUKE 15:20   1982   22    2   4   PATERA>AUTOU<DE ETI
7    LUKE 15:20   2162   24    2   4   ANASTAJ>HLQE<PROJ TON PATERA
8    LUKE 15:25   2788   31   -2   4   WJ ERXOMENOJ>HGGISE<TH OIKIA
9    LUKE 15:25   3508   39   -2   4   O UIOJ AUTOU O >PRESBUTEROJ<EN
10   LUKE 15:26   2248   25   -2   4   PAIDWN>EPUNQANETO<TI EIH
11   LUKE 15:29   3778   42   -2   4   IDOU TOSAUTA ETH>DOULEUW
12   LUKE 15:29   1438   16   -2   4   PARHLQON KAI>EMOI<OUDEPOTE
13   LUKE 15:30   3868   43   -2   4   HLQEN EQUSAJ>AUTW<TON MOSXON
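The columns of the table above are related by S = 90 x F + R, which makes the 13 rows easy to check mechanically. The following is a minimal sketch of such a check; the names OMITTED_HITS and is_consistent are illustrative only and do not come from the study.

```python
# Rows transcribed from the table above: (sum S, other factor F, radius R, length L).
OMITTED_HITS = [
    (2701, 30, 1, 4),
    (3059, 34, -1, 4),
    (2608, 29, -2, 3),
    (2068, 23, -2, 3),
    (3152, 35, 2, 4),
    (1982, 22, 2, 4),
    (2162, 24, 2, 4),
    (2788, 31, -2, 4),
    (3508, 39, -2, 4),
    (2248, 25, -2, 4),
    (3778, 42, -2, 4),
    (1438, 16, -2, 4),
    (3868, 43, -2, 4),
]


def is_consistent(s, f, r, base=90):
    """True when the phrase sum lies exactly r away from the multiple base * f."""
    return s == base * f + r


all_rows_ok = all(is_consistent(s, f, r) for s, f, r, _ in OMITTED_HITS)
radius_two = sum(1 for _, _, r, _ in OMITTED_HITS if abs(r) == 2)
four_word = sum(1 for _, _, _, length in OMITTED_HITS if length == 4)

print(all_rows_ok, radius_two, four_word)  # True 11 11
```

The two counts agree with the figures quoted for these hits: 11 phrases of four words and 11 phrases at cluster radius 2.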

In the hits produced by 90 with the random seed 666 given in section 10, the author conveniently lists the second phrase above, which is evidence that this particular phrase actually was in his sample (so we presume that the other phrases were as well). The author does not mark this phrase as nonsensical, as he has so many other phrases in this analysis. Further, in Definition #6 (p.21) the author states that all such phrases should be included: "The computer is capable of calculating every single phrase combination possible. So for this investigation, all mathematical possibilities will be extracted."

We are at a loss to explain his omission of these results in his analysis, other than the interesting fact that their existence does significantly weaken the significance of his result. Of these 13 instances (verily, a strikingly significant number:), 11 are 4-word phrases (affecting WLA), and 11 belong to a cluster of 2 (affecting clustering). The omission of these Theomatic instances profoundly affects the author's ultimate conclusion. Regardless, obviously, they should all have been included.


Effect on WLA

These four errors affect the Word Length Average (WLA) of the hits used to determine average phrase length in step (5), p. 27. The WLA for Theomatics published by the author is 109/46=2.370. The correct WLA for the 4-word phrases is 150/53=2.830. The 3-word and 2-word WLAs are similarly affected.

L   Old Words   Old Hits   Old WLA   New Words   New Hits   New WLA
4   109         46         2.370     150         53         2.830
3   81          39         2.077     78          35         2.228
2   39          25         1.560     30          19         1.579
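The WLA figures above are simply total primary words divided by number of hits. A short sketch that recomputes each entry from the word and hit counts in the table follows; the function name wla and the dictionary layout are illustrative, not the author's.

```python
def wla(words, hits):
    """Word Length Average: total primary words divided by the number of hits."""
    return words / hits


# (words, hits) pairs from the table above, keyed by maximum phrase length L.
OLD = {4: (109, 46), 3: (81, 39), 2: (39, 25)}
NEW = {4: (150, 53), 3: (78, 35), 2: (30, 19)}

for length in (4, 3, 2):
    old_value = wla(*OLD[length])
    new_value = wla(*NEW[length])
    print(f"L={length}  old WLA={old_value:.4f}  new WLA={new_value:.4f}")
```

Printing four decimal places shows how each three-place table entry was rounded; for instance, 78/35 evaluates to 2.2286.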


 

Reference Omission

A fifth error is the omission of a clear reference to one of the sons in verse 14: AUTOU. This word is translated "he" in the KJV, "and when he had spent all," and is a clear reference to the prodigal. It produces nine additional phrases in the phrase pool, three of each word length of 2 or more (this word has already been used in verse 13 as a single-word phrase, so it is not counted here again). It produces no hits and so weakens the probabilities of the hit results.
 

Effect on Sample Size

The above five errors affect the author's adjustment to the sample size (p.34) to achieve a WLA for the total sample equivalent to the WLA of the successes. In this procedure (for which, incidentally, the author offers no justification whatsoever) the author clearly takes an  unconventional liberty with the data: he reduces the sample space on presumption of adequate theoretical knowledge of Theomatic behavior. He requires that the WLA of the sample be no larger than the WLA of the hits obtained from it. We consider the validity of this maneuver separately.

For the 4-word phrases, for which he claims a hit WLA of 2.370, the author uses the WLA of the 3-word sample as a base, and calculates it (below) as 1127/467=2.413. After correcting the first five errors the ratio becomes 995/412, so the correct base WLA for the 3-word phrases is 2.415. To match the hit WLA the author deducts 33 phrases from the base of 467, bringing the 2.413 sample WLA down to 2.369, just below the WLA of the 4-word hits, and giving him a sample size of 467-33=434. Since the actual WLA of the 4-word phrases is 2.830, the correct calculation is to add 146 4-word phrases, raising the 995/412=2.415 WLA to (995+584=1579)/(412+146=558)=2.830. This gives us a sample size of 558 for the 4-word phrases.

The 3-word phrases require similar adjustment. His calculations are (p.47) 1127/467=2.413. To meet the 3-word WLA of 2.077 he deducts 170 phrases to get (1127-510)/(467-170)=2.077 resulting in a  sample size of 297. The correct calculation is (995-300)/(412-100)=2.228, giving us a sample of 412-100=312.

The 2-word phrases require similar adjustment. His calculations are (p.48) 404/226=1.788. To meet the 2-word WLA of 1.560 he deducts 117 phrases to get (404-234)/(226-117)=1.560 and a sample size of 226-117=109. The correct calculation is (344-172)/(195-86) =1.579, also yielding a sample of 195-86=109.

L   Old Words   Old Phrases   Old Base   Old Size   New Words   New Phrases   New Base   New Size
4   1127        467           2.413      434        995         412           2.415      558
3   1127        467           2.413      297        995         412           2.228      312
2   404         226           1.788      109        407         195           1.579      109
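The sample-size adjustments described above all have the same form: add or deduct k phrases of a fixed word length and recompute words divided by phrases. The sketch below reproduces the arithmetic quoted in the text; adjusted_wla is a hypothetical helper name, and this is only our reading of the procedure, not code from the study.

```python
def adjusted_wla(words, phrases, k, length):
    """WLA of the pool after adding k phrases of `length` words each.

    A negative k deducts phrases instead of adding them.
    """
    return (words + k * length) / (phrases + k)


# Published 4-word adjustment: deduct 33 three-word phrases from 1127/467.
print(round(adjusted_wla(1127, 467, -33, 3), 3))   # 2.369
# Corrected 4-word adjustment: add 146 four-word phrases to 995/412.
print(round(adjusted_wla(995, 412, 146, 4), 3))    # 2.83
# Published 3-word adjustment: deduct 170 three-word phrases from 1127/467.
print(round(adjusted_wla(1127, 467, -170, 3), 3))  # 2.077
# Corrected 3-word adjustment: deduct 100 three-word phrases from 995/412.
print(round(adjusted_wla(995, 412, -100, 3), 3))   # 2.228
```

The resulting sample size in each case is the adjusted phrase count, for example 412 + 146 = 558 for the corrected 4-word case.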

This corrected sample size then affects the p-value calculation. The published p-value for the 4-word phrases is for 46 hits in 434 phrases, which is .00002606, and the published odds for this result are 1:38,273 (p. 38; evidently a typo, since 1/.00002606 gives 1:38,373). The correct p-value is for 53 hits in 558 phrases, being .00012678 with odds of 1:7,888. Similar adjustments are made for the 3-word and 2-word phrases.

L   Old Hits   Old Size   Old p-value   Old Odds      New Hits   New Size   New p-value   New Odds
4   46         434        2.6098E-05    38,318        53         558        1.2678E-04    7,888
3   39         297        6.8727E-07    1,455,039     35         312        7.3054E-05    13,688
2   25         109        1.1801E-09    847,361,249   19         109        8.8862E-06    112,533
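The odds columns above are the reciprocals of the corresponding p-values, rounded to a whole number. A minimal sketch of that conversion, with odds_against as an illustrative name of our own:

```python
def odds_against(p_value):
    """Express a p-value as the N in '1:N' by taking its reciprocal."""
    return round(1 / p_value)


print(odds_against(0.00002606))  # 38373
print(odds_against(1.2678e-4))   # 7888
```

The first line is the check behind the remark that the published 1:38,273 looks like a typo for 1:38,373.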


 

Cluster Statistics

A sixth error is noted, where the author incorrectly places hit #20 from a 4-word phrase into the cluster of 1. It is actually a direct hit, which one can easily verify from the author's notes on p.16. Either something is wrong with the computer program generating his results, or a good bit of this work was done by hand, introducing human error into his findings.

The first five errors have a corresponding effect on the p-value of the cluster analysis. From the first and third errors, the redundant 2-word and ending-conjunction hits were published as direct hits, implying that the cluster distribution was actually 14:21:9 instead of 16:21:9. The second error included three direct hits and one of radius 2, implying the corrected distribution is 11:21:8. Hit #20 was counted as a 1 when it should have been a 0, implying that the cluster distribution is actually 12:20:8. 13 hits were omitted from the analysis, 11 of which were of cluster 2, and 2 of cluster 1, giving us 12:22:19.

This gives a Chi Square p-value of .8012 with odds of 1:1.2 instead of the published .00611 with odds of 1:164. The 3-word and 2-word cluster analyses are also affected.

L   Old 0   Old 1   Old 2   Old p-value   Old Odds   New 0   New 1   New 2   New p-value   New Odds
4   16      21      9       .00611        164        12      22      19      .80116        1.2
3   15      16      8       .00563        178        10      16      9       .18664        5.4
2   11      10      4       .00452        221        6       10      3       .09000        11.1
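The cluster p-values in the table above can be reproduced with a chi-square goodness-of-fit test on the three counts. The sketch below rests on two assumptions that this section does not state outright: that hits are expected in the ratio 1:2:2 across radius 0, 1 and 2 (one sum at the multiple, two at distance 1, two at distance 2), and that the test has two degrees of freedom. With two degrees of freedom the chi-square tail probability is exp(-x/2), so no statistics library is needed. The function name cluster_p_value is ours.

```python
import math


def cluster_p_value(direct, radius1, radius2):
    """Chi-square p-value for hit counts at radius 0, 1 and 2.

    Assumes an expected ratio of 1:2:2 and two degrees of freedom,
    for which the survival function is exp(-chi2 / 2).
    """
    observed = (direct, radius1, radius2)
    total = sum(observed)
    expected = [total * weight / 5 for weight in (1, 2, 2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return math.exp(-chi2 / 2)


print(f"{cluster_p_value(16, 21, 9):.5f}")   # 0.00611
print(f"{cluster_p_value(12, 22, 19):.5f}")  # 0.80116
```

Under these assumptions the published 16:21:9 distribution and the corrected 12:22:19 distribution give the 4-word p-values shown in the table.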

Correcting all six errors affects the final calculated probability of both the hit and cluster events occurring for each phrase length. The published values for the 4-word phrases are .00002606 x .00611 = .00000015923 with odds of 1:6,280,224 (p. 44). The correct values are .00012678 x .80116 = .00010157 with odds of 1:9845. The 3-word and 2-word calculations require similar adjustment.

L   Old hit p-value   Old CS p-value   Old Final p-value   Old Odds      New hit p-value   New CS p-value   New Final p-value   New Odds
4   2.6098E-05        .00611           1.5923E-07          6,271,293     1.2678E-04        .80116           1.0157E-04          9,845
3   6.8727E-07        .00563           3.8699E-09          258,402,863   7.3054E-05        .18664           1.3635E-05          73,341
2   1.1801E-09        .00452           5.3302E-12          1.8761E+11    8.8862E-06        .09001           7.9985E-07          1,250,239
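The final figures in the table above are the product of the hit p-value and the cluster p-value, and the old and new odds can be compared directly as a ratio. The following sketch works through the corrected 4-word row and the three ratios, using values transcribed from the table; combined_p and ODDS are illustrative names.

```python
def combined_p(hit_p, cluster_p):
    """Joint p-value, treating the hit and cluster events as independent."""
    return hit_p * cluster_p


# Corrected 4-word row: hit p-value times Chi Square p-value.
new_final = combined_p(1.2678e-4, 0.80116)
print(f"{new_final:.4e}", round(1 / new_final))  # 1.0157e-04 9845

# (old odds, new odds) per phrase length, transcribed from the table.
ODDS = {4: (6_271_293, 9_845), 3: (258_402_863, 73_341), 2: (1.8761e11, 1_250_239)}
fold = {length: round(old / new) for length, (old, new) in ODDS.items()}
print(fold)  # {4: 637, 3: 3523, 2: 150059}
```

The ratios give the size of the overstatement for each phrase length, taking the table values at face value.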

The author has inflated the odds by a factor of roughly 600 for 4-word phrases, 3,500 for 3-word phrases and 150,000 for 2-word phrases (the ratio of old to new odds in each row above). These adjustments may not materially affect one's conclusion at this point in the analysis; we do not find such results impressive in any case, since sample randomness cannot credibly be assumed even if this were a Theomatic event, and this one is imagined! They do, however, eventually have a clear and significant bearing on the final analysis.
