Suffix tree construction LCP array
case 1 (
d
(
v
)
=
h
[
i
+
1
]
{\displaystyle d(v)=h[i+1]}
): suppose suffixes
a
$
{\displaystyle a\$}
,
a
n
a
$
{\displaystyle ana\$}
,
a
n
a
n
a
$
{\displaystyle anana\$}
,
b
a
n
a
n
a
$
{\displaystyle banana\$}
of string
s
=
b
a
n
a
n
a
$
{\displaystyle s=banana\$}
added suffix tree. suffix
n
a
$
{\displaystyle na\$}
added tree shown in picture. rightmost path highlighted in red.
start
s
t
0
{\displaystyle st_{0}}
, tree consisting of root. insert
a
[
i
+
1
]
{\displaystyle a[i+1]}
s
t
i
{\displaystyle st_{i}}
, walk rightmost path beginning @ inserted leaf
a
[
i
]
{\displaystyle a[i]}
root, until deepest node
v
{\displaystyle v}
d
(
v
)
≤
h
[
i
+
1
]
{\displaystyle d(v)\leq h[i+1]}
reached.
we need distinguish 2 cases:
d
(
v
)
=
h
[
i
+
1
]
{\displaystyle d(v)=h[i+1]}
: means concatenation of labels on root-to-
v
{\displaystyle v}
path equals longest common prefix of suffixes
a
[
i
]
{\displaystyle a[i]}
,
a
[
i
+
1
]
{\displaystyle a[i+1]}
.
in case, insert
a
[
i
+
1
]
{\displaystyle a[i+1]}
new leaf
x
{\displaystyle x}
of node
v
{\displaystyle v}
, label edge
(
v
,
x
)
{\displaystyle (v,x)}
s
[
a
[
i
+
1
]
+
h
[
i
+
1
]
,
n
]
{\displaystyle s[a[i+1]+h[i+1],n]}
. edge label consists of remaining characters of suffix
a
[
i
+
1
]
{\displaystyle a[i+1]}
not represented concatenation of labels of root-to-
v
{\displaystyle v}
path.
this creates partial suffix tree
s
t
i
+
1
{\displaystyle st_{i+1}}
.
case 2 (
d
(
v
)
<
h
[
i
+
1
]
{\displaystyle d(v)<h[i+1]}
): in order add suffix
n
a
n
a
$
{\displaystyle nana\$}
, edge inserted suffix
n
a
$
{\displaystyle na\$}
has split up. new edge new internal node labeled longest common prefix of suffixes
n
a
$
{\displaystyle na\$}
,
n
a
n
a
$
{\displaystyle nana\$}
. edges connecting 2 leaves labeled remaining suffix characters not part of prefix.
d
(
v
)
<
h
[
i
+
1
]
{\displaystyle d(v)<h[i+1]}
: means concatenation of labels on root-to-
v
{\displaystyle v}
path displays less characters longest common prefix of suffixes
a
[
i
]
{\displaystyle a[i]}
,
a
[
i
+
1
]
{\displaystyle a[i+1]}
, missing characters contained in edge label of
v
{\displaystyle v}
s rightmost edge. therefore, have split edge follows:
let
w
{\displaystyle w}
child of
v
{\displaystyle v}
on
s
t
i
{\displaystyle st_{i}}
s rightmost path.
a simple amortization argument shows running time of algorithm bounded
o
(
n
)
{\displaystyle o(n)}
:
the nodes traversed in step
i
{\displaystyle i}
walking rightmost path of
s
t
i
{\displaystyle st_{i}}
(apart last node
v
{\displaystyle v}
) removed rightmost path, when
a
[
i
+
1
]
{\displaystyle a[i+1]}
added tree new leaf. these nodes never traversed again subsequent steps
j
>
i
{\displaystyle j>i}
. therefore, @
2
n
{\displaystyle 2n}
nodes traversed in total.
Comments
Post a Comment