
Indus Script: Search for Grammar¹

Nisha Yadav
Tata Institute of Fundamental Research

Collaborators: Mayank Vahia, Iravatham Mahadevan, Hrishikesh Joglekar

¹ Lecture given at a two-day seminar on “The Indus Script: Problems and Prospects”, Chennai

Indus Script: Search for Grammar, Yadav et al. (2007)

Contents

1) Indus Script - An Overview
2) Various Approaches
3) Our Approach
4) Dataset
5) Preliminary Analysis
6) Analysis 1: Check against random order
7) Analysis 2: Positional analysis of frequent sign combinations
8) Text Beginners and Text Enders
9) Segmentation of Indus Texts
10) Summary

Note: In the lecture, unless specified otherwise, all text examples are from Mahadevan 1977 and all images are from Parpola’s UNESCO volumes of Indus seals.

1) Indus Script: An Overview

Indus Valley Civilization

From Mahadevan, 1977


Indus Script: Pointers to understand

The Indus script is one of the few scripts that defy decipherment.

Inscriptions are found only on small objects like seals.

The inscriptions are very brief: average length 4-5 signs.

There are only 417 signs in the script as per Mahadevan’s Concordance (1977).

The script is pictographic, with signs showing humans, fish, etc.

Signs are modified by joining or by strokes, and many signs appear as combinations of other simple signs.

The direction of the script is variable (mostly right to left: 83% of the time).

In general, the seals are 1 to 2 square inches in size.

There are no bilingual texts to aid decipherment.

Direction indicators of the script

Cramping or overflow of signs at the left end
Orientation of asymmetric signs
Sequence of frequent combinations of signs
Split sequences

[Figure: A split sequence indicating direction]

Scale of a typical seal

For the most part, seals are between 1 inch and 2 inches square.

From Professor John C. Huntington’s ppt

[Figure: a seal and its seal impression. From Professor John C. Huntington’s ppt]

Specimens of Indus Texts on different objects

From Mahadevan, 1977

2) Various Approaches

Indus Script

Scientists from a variety of disciplines have attempted to read the Indus script, without reaching a clear answer.

Various attempts so far include:
I. Mahadevan’s analytical work: creation of the first published Concordance (1977)
Gift Siromoney’s statistical work
A. Parpola’s comparison with Dravidian
Russian group’s comparison with Dravidian
Subbarayappa’s interpretation as pure numerals
S. R. Rao’s interpretation as Vedic literature
Others (Ref. Possehl, 1996)

3) Our Approach

We make no assumption about its content or meaning.

Our first emphasis is to attempt to WRITE IN THE SCRIPT RATHER THAN READ.

We search for rules of writing without assigning meanings or interpretations.

We ignore variation due to archaeological context of sites, stratigraphy and type of objects.

4) Dataset

Dataset

An unambiguous data subset (EBUDS) was created for analysis of the grammar of Indus writing, from the original electronic dataset of Mahadevan (1977), partially modified as M80.

EBUDS (Extended Basic Unique Dataset) excludes:
All ambiguous lines
All texts from sides having multiple lines
All duplicates (keeping a single occurrence of each)

Thus, EBUDS consists of 1548 lines of text, with 7000 sign occurrences.
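
The three exclusion rules can be sketched as a small filter. This is only an illustration: the record layout (`signs`, `ambiguous`, `lines_on_side`) is a hypothetical one, not the format of Mahadevan's electronic dataset, and the sample records are invented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TextLine:
    signs: tuple          # sign numbers of one line of text (hypothetical layout)
    ambiguous: bool       # True if any sign in the line is doubtful
    lines_on_side: int    # number of text lines on the same side of the object

def build_unique_subset(lines):
    """Apply the three exclusion rules, keeping the first copy of each text."""
    seen, kept = set(), []
    for line in lines:
        if line.ambiguous or line.lines_on_side > 1:
            continue                      # rules 1 and 2
        if line.signs in seen:
            continue                      # rule 3: drop repeats
        seen.add(line.signs)
        kept.append(line)
    return kept

# Invented sample records, for illustration only
sample = [
    TextLine((342, 162, 249), False, 1),
    TextLine((342, 162, 249), False, 1),  # duplicate, dropped
    TextLine((99, 267), True, 1),         # ambiguous, dropped
    TextLine((59, 87, 99), False, 2),     # multi-line side, dropped
    TextLine((211, 89, 336), False, 1),
]
subset = build_unique_subset(sample)
print(len(subset), "lines,", sum(len(t.signs) for t in subset), "sign occurrences")
```

Run on the full dataset, the two printed totals would correspond to the 1548 lines and 7000 sign occurrences quoted above.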

5) Preliminary Analysis

Frequency distribution of Indus Signs

| Frequency range in M77 | M77: No. of signs | M77: Total sign occurrences | M77: Total sign occurrences (%) | EBUDS: No. of signs | EBUDS: Total sign occurrences | EBUDS: Total sign occurrences (%) |
|---|---|---|---|---|---|---|
| >1000 | 1 | 1395 | 10.43 | 1 | 715 | 10.21 |
| 999-500 | 1 | 649 | 4.85 | 1 | 377 | 5.39 |
| 499-100 | 31 | 6344 | 47.44 | 31 | 3230 | 46.14 |
| 99-50 | 34 | 2381 | 17.81 | 34 | 1243 | 17.76 |
| 49-10 | 86 | 1833 | 13.71 | 86 | 975 | 13.93 |
| 9-2 | 152 | 658 | 4.92 | 152 | 388 | 5.54 |
| 1 | 112 | 112 | 0.84 | 72 | 72 | 1.03 |
| 0 | 0 | - | - | 40 | - | - |
| Total | 417 | 13372 | 100.00 | 417 | 7000 | 100.00 |

Only 67 signs (16% of the total number of signs) account for over 80% of the writing.

Conclusions from Preliminary Analysis

The frequency distribution of the signs in EBUDS is consistent with M77.

The manner of choosing the data set has not changed the pattern of occurrence of various signs and the results are consistent with the analysis of M77.

Only 67 signs (16% of total no. of signs) account for over 80% of the writing.
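
The 67-sign figure can be checked directly from the frequency table. The sketch below sums the four most frequent bins of the M77 and EBUDS columns; the variable and function names are ours, chosen for illustration.

```python
# (frequency range in M77, number of signs, total sign occurrences)
m77_bins = [
    (">1000", 1, 1395), ("999-500", 1, 649), ("499-100", 31, 6344),
    ("99-50", 34, 2381), ("49-10", 86, 1833), ("9-2", 152, 658), ("1", 112, 112),
]
ebuds_bins = [
    (">1000", 1, 715), ("999-500", 1, 377), ("499-100", 31, 3230),
    ("99-50", 34, 1243), ("49-10", 86, 975), ("9-2", 152, 388), ("1", 72, 72),
]

def coverage(bins, top_k):
    """Number of signs in the top_k bins and their share of all occurrences (%)."""
    total_occurrences = sum(occ for _, _, occ in bins)
    signs = sum(n for _, n, _ in bins[:top_k])
    occurrences = sum(occ for _, _, occ in bins[:top_k])
    return signs, 100 * occurrences / total_occurrences

m77_signs, m77_share = coverage(m77_bins, top_k=4)
ebuds_signs, ebuds_share = coverage(ebuds_bins, top_k=4)
print(f"M77:   {m77_signs} signs cover {m77_share:.1f}% of occurrences")
print(f"EBUDS: {ebuds_signs} signs cover {ebuds_share:.1f}% of occurrences")
```

With the table values, the same 67 signs cover about 80.5% of occurrences in M77 and about 79.5% in EBUDS.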

6) Analysis 1: Check against Random Order

Methodology

We take the 1548 unique texts (7000 signs) present in EBUDS.

We randomise their appearance keeping the frequency of each sign as in EBUDS.

We split this long random string (of 7000 signs) into texts of 1 to 14 signs as in EBUDS.

We create 10 such random databases.

We then compare the frequency of their sign pairs, triplets etc. with Genuine Indus database (EBUDS) to check if Indus texts have any significant sequencing.
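
A minimal sketch of this randomisation test follows. It assumes that the random string is cut back into texts with the same lengths as the originals, which is one reading of "as in EBUDS"; the toy corpus is invented, and the real input would be the 1548 EBUDS texts.

```python
import random
from collections import Counter

def randomise(texts, rng):
    """Shuffle all signs (frequencies unchanged), then re-cut into the original text lengths."""
    pool = [sign for text in texts for sign in text]
    rng.shuffle(pool)
    shuffled, start = [], 0
    for text in texts:
        shuffled.append(pool[start:start + len(text)])
        start += len(text)
    return shuffled

def top_ngram_count(texts, n):
    """Frequency of the most frequent run of n consecutive signs within a text."""
    counts = Counter(
        tuple(text[i:i + n]) for text in texts for i in range(len(text) - n + 1)
    )
    return max(counts.values(), default=0)

# Invented toy corpus, for illustration only
corpus = [[342, 162, 249], [99, 267], [342, 162, 249, 99, 267], [211, 89, 336]] * 5
rng = random.Random(0)                      # fixed seed so the run is repeatable
random_sets = [randomise(corpus, rng) for _ in range(10)]
for n in (2, 3):
    random_mean = sum(top_ngram_count(rs, n) for rs in random_sets) / len(random_sets)
    print(n, top_ngram_count(corpus, n), random_mean)
```

Each printed row gives the combination length, the count in the genuine corpus and the mean count over the ten random sets, which is the comparison made on the following slides.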

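Comparison of EBUDS with Random Datasets

Frequency of the most frequent sign combination in each of the 10 random datasets and in EBUDS:

| No. of signs in the sign combination | Random 1 | Random 2 | Random 3 | Random 4 | Random 5 | Random 6 | Random 7 | Random 8 | Random 9 | Random 10 | Random mean | EBUDS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 60 | 54 | 62 | 51 | 57 | 56 | 63 | 66 | 58 | 56 | 58.3 | 168 |
| 3 | 5 | 3 | 3 | 4 | 3 | 5 | 7 | 5 | 5 | 3 | 4.3 | 34 |
| 4 | 1 | 1 | 1 | 2 | 1 | 1 | 2 | 2 | 1 | 1 | 1.3 | 16 |
| 5 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 4 |
| 6 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 |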

Result of Analysis 1

[Figure: Most frequent sign combination frequency vs. no. of signs in the combination. x-axis: No. of signs in the combination; y-axis: Frequency of most frequent combination; series: Random Datasets (Mean), Genuine Indus Dataset]

Conclusions from Analysis 1

Sign combinations of length 2, 3 and 4 appear with frequencies far higher than expected by random chance.

The signs are ordered in a specific manner.

It is justifiable to state that Indus texts followed certain rules and thereby conveyed something meaningful.
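
To put a rough number on "far higher than expected", the sketch below takes the counts from the comparison table and reports, for each combination length, the ratio of the EBUDS count to the random mean and its distance from that mean in sample standard deviations. The standard-deviation measure is our own illustrative addition, not a statistic reported in the lecture, and with only ten random datasets it is a coarse indicator.

```python
from statistics import mean, stdev

# Most frequent n-sign combination: counts in the 10 random sets, then in EBUDS
table = {
    2: ([60, 54, 62, 51, 57, 56, 63, 66, 58, 56], 168),
    3: ([5, 3, 3, 4, 3, 5, 7, 5, 5, 3], 34),
    4: ([1, 1, 1, 2, 1, 1, 2, 2, 1, 1], 16),
}

def excess(random_counts, observed):
    """Ratio to the random mean, and distance from it in sample standard deviations."""
    mu, sigma = mean(random_counts), stdev(random_counts)
    return observed / mu, (observed - mu) / sigma

results = {n: excess(counts, observed) for n, (counts, observed) in table.items()}
for n, (ratio, z) in results.items():
    print(f"{n}-sign: {ratio:.1f}x the random mean, {z:.1f} standard deviations above it")
```

Lengths 5 and 6 are left out because every random dataset gives a count of 1 there, so the spread is zero and the second measure is undefined.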

7) Analysis 2: Positional Analysis of Frequent Sign Combinations

Positional Analysis of Frequent Two-sign Combinations

| Two-sign combination | Frequency | Solo (%) | Left (%) | Middle (%) | Right (%) |
|---|---|---|---|---|---|
| 99 267 | 168 | 0.60 | 1.79 | 11.90 | 85.71 |
| 89 336 | 75 | 0.00 | 0.00 | 89.33 | 10.67 |
| 176 342 | 59 | 0.00 | 96.61 | 3.39 | 0.00 |
| 342 8 | 58 | 1.72 | 72.41 | 25.86 | 0.00 |
| 99 391 | 56 | 0.00 | 0.00 | 8.93 | 91.07 |
| 342 347 | 56 | 0.00 | 89.29 | 10.71 | 0.00 |
| 1 342 | 48 | 0.00 | 89.58 | 10.42 | 0.00 |
| 123 293 | 40 | 0.00 | 0.00 | 0.00 | 100.00 |
| 59 87 | 39 | 0.00 | 0.00 | 79.49 | 20.51 |
| 342 48 | 38 | 2.63 | 52.63 | 28.95 | 15.79 |
| 59 171 | 36 | 0.00 | 0.00 | 80.56 | 19.44 |
| 162 249 | 34 | 0.00 | 0.00 | 85.29 | 14.71 |
| 211 89 | 34 | 0.00 | 91.18 | 8.82 | 0.00 |
| 245 245 | 33 | 0.00 | 60.61 | 21.21 | 18.18 |
| 211 59 | 31 | 0.00 | 90.32 | 9.68 | 0.00 |
| 67 65 | 27 | 0.00 | 0.00 | 74.07 | 25.93 |
| 130 51 | 27 | 0.00 | 7.41 | 70.37 | 22.22 |
| 67 99 | 26 | 0.00 | 0.00 | 100.00 | 0.00 |
| 342 162 | 25 | 4.00 | 84.00 | 12.00 | 0.00 |
| 343 123 | 25 | 0.00 | 0.00 | 100.00 | 0.00 |

Positional Analysis of Frequent Three-sign Combinations

| Three-sign combination | Frequency | Solo (%) | Left (%) | Middle (%) | Right (%) |
|---|---|---|---|---|---|
| 211 89 336 | 34 | 2.94 | 88.24 | 5.88 | 2.94 |
| 343 123 293 | 25 | 0.00 | 0.00 | 0.00 | 100.00 |
| 342 162 249 | 24 | 4.17 | 83.33 | 8.33 | 4.17 |
| 342 169 249 | 20 | 5.00 | 70.00 | 20.00 | 5.00 |
| 342 8 171 | 19 | 5.26 | 73.68 | 5.26 | 15.79 |
| 149 130 51 | 19 | 0.00 | 0.00 | 78.95 | 21.05 |
| 59 87 99 | 16 | 0.00 | 0.00 | 100.00 | 0.00 |
| 342 87 403 | 16 | 6.25 | 81.25 | 6.25 | 6.25 |
| 342 149 130 | 16 | 0.00 | 75.00 | 25.00 | 0.00 |
| 67 99 267 | 14 | 0.00 | 0.00 | 7.14 | 92.86 |
| 87 99 267 | 14 | 0.00 | 0.00 | 21.43 | 78.57 |
| 89 336 72 | 14 | 0.00 | 0.00 | 85.71 | 14.29 |
| 65 99 267 | 12 | 0.00 | 0.00 | 8.33 | 91.67 |
| 342 244 67 | 12 | 8.33 | 66.67 | 8.33 | 16.67 |
| 15 389 178 | 11 | 9.09 | 72.73 | 0.00 | 18.18 |
| 59 171 53 | 10 | 0.00 | 0.00 | 60.00 | 40.00 |
| 245 245 25 | 10 | 10.00 | 90.00 | 0.00 | 0.00 |

Positional Analysis of Frequent Four-sign Combinations

| Four-sign combination | Frequency | Solo (%) | Left (%) | Middle (%) | Right (%) |
|---|---|---|---|---|---|
| 342 149 130 51 | 16 | 6.25 | 68.75 | 6.25 | 18.75 |
| 59 87 99 267 | 9 | 0.00 | 0.00 | 33.33 | 66.67 |
| 89 336 59 171 | 6 | 0.00 | 0.00 | 83.33 | 16.67 |
| 15 389 178 98 | 5 | 0.00 | 100.00 | 0.00 | 0.00 |
| 342 53 230 175 | 5 | 20.00 | 80.00 | 0.00 | 0.00 |
| 342 169 249 65 | 5 | 20.00 | 20.00 | 20.00 | 40.00 |
| 211 89 336 72 | 5 | 0.00 | 80.00 | 0.00 | 20.00 |
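
The Solo / Left / Middle / Right percentages in these tables can be produced by a tally like the one below. It is a sketch under two assumptions of ours: each text is stored as a list of sign numbers in left-to-right display order, and "solo" means the combination makes up the whole text. The toy texts are invented.

```python
from collections import Counter

LABELS = ("solo", "left", "middle", "right")

def positions(texts, combo):
    """Percentage of occurrences of `combo` found solo, at the left end, in the middle, at the right end."""
    combo, n = tuple(combo), len(combo)
    tally = Counter()
    for text in texts:
        for i in range(len(text) - n + 1):
            if tuple(text[i:i + n]) != combo:
                continue
            if len(text) == n:
                tally["solo"] += 1        # the combination is the entire text
            elif i == 0:
                tally["left"] += 1
            elif i + n == len(text):
                tally["right"] += 1
            else:
                tally["middle"] += 1
    total = sum(tally.values())
    if total == 0:
        return {}
    return {label: 100 * tally[label] / total for label in LABELS}

# Invented toy texts, for illustration only
texts = [[342, 162, 249], [342, 162, 249, 99, 267], [59, 87, 99, 267], [99, 267]]
print(positions(texts, (99, 267)))
print(positions(texts, (342, 162)))
```

Applied to EBUDS with the listed combinations, each returned dictionary would correspond to one table row.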

Conclusions from Positional Analysis

The most frequent two-sign, three-sign and four-sign combinations appear at fixed positions.

The exact location varies from combination to combination.

However, frequently occurring two-sign, three-sign and four-sign combinations may be incomplete, except of course when they occur as solo texts.

It can be seen that two-sign, three-sign and four-sign combinations which are complete typically have one of the text-enders (mostly 342 or 211) at the end. This is confirmed by the solo occurrences of such texts.

8) Text Beginners and Text Enders

Indus Text Beginners and Enders

[Figure: Enders and Beginners (EBUDS). x-axis: Number of Signs; y-axis: Fractional Cumulative Frequency; series: Enders, Beginners]

Consider an Indus Text with Signs

G F E D C B A

(In order of their statistical significance)

Frequent Text Enders

Frequent Text Beginners

Specimens of Indus Texts illustrating syntactical patterns

From Mahadevan (1986)

Conclusions for Indus Script

There are well-defined text-enders, though text-beginners are less well defined.

Sign distribution within the strings seems to be ordered as per some specific rules. The ordering is far stronger than would arise by chance.

This indicates the existence of patterns and rules that remain to be uncovered.
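
How sharply defined the enders are relative to the beginners can be read off a fractional cumulative frequency curve like the one plotted for EBUDS. The sketch below builds such curves for a toy corpus. The texts are invented, and the code assumes display order with right-to-left reading, so the leftmost sign ends a text and the rightmost sign begins it.

```python
from collections import Counter
from itertools import accumulate

def cumulative_share(signs):
    """Fraction of texts covered by the 1, 2, 3, ... most frequent signs."""
    counts = sorted(Counter(signs).values(), reverse=True)
    total = sum(counts)
    return [running / total for running in accumulate(counts)]

# Invented toy texts in left-to-right display order, for illustration only
texts = [[342, 162, 249], [342, 99, 267], [342, 87, 59], [211, 89, 336], [342, 8, 171]]
enders = cumulative_share(text[0] for text in texts)       # leftmost signs
beginners = cumulative_share(text[-1] for text in texts)   # rightmost signs
print("enders:   ", enders)
print("beginners:", beginners)
```

A curve that rises steeply means a handful of signs account for most text endings, while a slowly rising curve means the role is spread over many signs.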

9) Segmentation of Indus Texts

Segmentation Approach

Various methods can be used for segmenting an Indus text, namely:
Comparing texts
Using frequent combinations of signs
Using pair frequencies
Using single signs (Enders, Beginners, Auxiliary Enders)

These methods overlap, so we selected an approach that takes the effect of each into account. A cumulative method, based on statistically significant units, is thus formulated.

Segmentation Process

INDUS TEXT
Step 1: Look for pair, triplet and quad texts successively (55% split)
Step 2: Look for frequent 4, 3 and 2 sign combinations successively (77% split)
Step 3: Look for Enders, Beginners and Auxiliary Enders successively (88% split)
TEXT SEGMENTS

Percent of texts split (for texts of 5 or more signs)
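
As a rough illustration of the idea of cutting a text at known units, here is a simplified greedy matcher. It is not the cumulative procedure used in the study: it collapses the three stages into a single lookup set, always prefers the longest match, and scans in one direction only. The unit set borrows a few combinations from the positional tables, and the example texts are invented.

```python
def segment(text, units, max_len=4):
    """Greedy split of `text` into known units, trying the longest unit first."""
    segments, i = [], 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 1, -1):
            candidate = tuple(text[i:i + n])
            if candidate in units:
                segments.append(candidate)
                i += n
                break
        else:
            segments.append((text[i],))   # no known unit starts here
            i += 1
    return segments

# Hypothetical unit set (short texts and frequent combinations merged together)
units = {(342, 149, 130, 51), (211, 89, 336), (89, 336), (99, 267)}

print(segment([342, 149, 130, 51, 99, 267], units))
print(segment([1, 99, 267], units))
```

A text counts as split when the result has more than one segment and no segment is longer than four signs; signs that match no unit are kept as single-sign segments here.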

[Figure: Segment Length vs. Segment Frequency in EBUDS before and after segmentation. x-axis: Text or Segment length (1 to 14); y-axis: Number of Texts or Segments; series: EBUDS before segmentation, EBUDS after segmentation]

[Figure: EBUDS before and after segmentation (two panels: EBUDS before Segmentation, EBUDS after Segmentation)]

[Figure: A few examples of segmentation]

Conclusions from Segmentation

It is possible to segment 88% of Indus texts of length 5 and above into segments of length 4 and below by using statistically significant signs and their combinations in addition to all the texts of length 2, 3 and 4.

Many frequent sign combinations make their appearance as independent texts.

The Indus texts after segmentation can be viewed as permutations of the identifiable units (segments) of 2, 3 or 4 signs.

The identifiable units may or may not be standalone (or complete) pieces of information.

10) Summary

Summary

The writing is highly ordered.

The typical length of information-containing units is 2, 3 or at most 4 signs.

However, they are not always complete enough to exist as standalone pieces of text.

This suggests a more complex grammar in the writing where information units need proper beginners or enders.

The present study shows that Indus writing seems to have specific ordering as would be expected if sophisticated information is coded. This is consistent with the general level of sophistication associated with the Indus culture.

End