Longest common substring in linear time
We know that the longest common substring of two strings can be found in $\mathcal{O}(N^2)$ time.
Can a solution be found in only linear time?
algorithms time-complexity strings longest-common-substring
edited Mar 25 at 4:01 by Glorfindel
asked Mar 23 at 22:44 by Manoharsinh Rana
2 Answers
Let $m$ and $n$ be the lengths of the two given strings.
Linear time assuming the size of the alphabet is constant.
Yes, the longest common substring of two given strings can be found in $O(m+n)$ time, assuming the size of the alphabet is constant.
Here is an excerpt from the Wikipedia article on the longest common substring problem:
The longest common substrings of a set of strings can be found by building a generalized suffix tree for the strings, and then finding the deepest internal nodes which have leaf nodes from all the strings in the subtree below it.
Building a generalized suffix tree for the two given strings takes $O(m+n)$ time using Ukkonen's algorithm. Finding the deepest internal nodes whose subtrees contain leaves from both strings also takes $O(m+n)$ time, since a single traversal of the tree suffices. Hence we can find the longest common substring in $O(m+n)$ time.
For a working implementation, please take a look at Suffix Tree Application 5 – Longest Common Substring at GeeksforGeeks.
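The "deepest node shared by both strings" idea can be illustrated with a much simpler structure than a real suffix tree. The sketch below is only a toy under stated assumptions: it builds an uncompressed generalized suffix trie by inserting every suffix of both strings, which takes quadratic time and space, so it is not Ukkonen's algorithm and it is not linear. The names `Node` and `lcs_via_suffix_trie` are made up for this illustration. Each node carries a bitmask recording which input strings pass through it, and the deepest node with both bits set spells a longest common substring.

```python
class Node:
    """One trie node: outgoing edges plus a bitmask of the strings that reach it."""
    __slots__ = ("children", "mask")

    def __init__(self):
        self.children = {}
        self.mask = 0  # bit 1 = reached by a suffix of s, bit 2 = reached by a suffix of t


def lcs_via_suffix_trie(s: str, t: str) -> str:
    """Longest common substring via a naive generalized suffix trie.

    Illustration only: O(m^2 + n^2) time and space, unlike a real suffix tree.
    """
    root = Node()
    best = ""
    # Insert all suffixes of s (bit 1) first, then all suffixes of t (bit 2).
    for bit, text in ((1, s), (2, t)):
        for start in range(len(text)):
            node = root
            for end in range(start, len(text)):
                ch = text[end]
                child = node.children.get(ch)
                if child is None:
                    child = node.children[ch] = Node()
                child.mask |= bit
                node = child
                # The node's depth equals the length of the substring it spells.
                # A node with both bits set is a substring of s and of t.
                if node.mask == 3 and end + 1 - start > len(best):
                    best = text[start:end + 1]
    return best
```

For example, `lcs_via_suffix_trie("banana", "ananas")` returns `"anana"`. A real suffix tree compresses chains of single-child nodes into edges, which is what brings the size down to linear.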
(Improved!) Linear time
In fact, the longest common substring of two given strings can be found in $O(m+n)$ time regardless of the size of the alphabet.
Here is the abstract of "Computing Longest Common Substrings Via Suffix Arrays" by Maxim Babenko and Tatiana Starikovskaya (2008):
Given a set of $N$ strings $A = \{\alpha_1, \cdots, \alpha_N\}$ of total length $n$ over alphabet $\Sigma$ one may ask to find, for each $2 \le K \le N$, the longest substring $\beta$ that appears in at least $K$ strings in $A$. It is known that this problem can be solved in $O(n)$ time with the help of suffix trees. However, the resulting algorithm is rather complicated (in particular, it involves answering certain least common ancestor queries in $O(1)$ time). Also, its running time and memory consumption may depend on $|\Sigma|$.
This paper presents an alternative, remarkably simple approach to the above problem, which relies on the notion of suffix arrays. Once the suffix array of some auxiliary $O(n)$-length string is computed, one needs a simple $O(n)$-time postprocessing to find the requested longest substring. Since a number of efficient and simple linear-time algorithms for constructing suffix arrays has been recently developed (with constant not depending on $|\Sigma|$), our approach seems to be quite practical.
Here is the general idea of the algorithm in the paper above. Let the string $\alpha$ be the concatenation of all $\alpha_i$ with separating sentinels. Construct the suffix array for $\alpha$ as well as its longest-common-prefix array. Apply a sliding window technique to these arrays to obtain the longest common substrings.
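For just two strings, the outline above might be sketched as follows. This is a minimal illustration, not the paper's algorithm verbatim: the function names are hypothetical, the sentinel is the integer `-1` (an assumption that works because every character code is non-negative), and the suffix array is built by plain sorting of suffixes, which is far from linear. Only the longest-common-prefix step (Kasai's algorithm) and the final scan are linear here. With two strings the sliding window shrinks to two adjacent suffix-array entries that come from different strings.

```python
def suffix_array(codes):
    """Suffix start positions in lexicographic order (naive sort, NOT linear time)."""
    return sorted(range(len(codes)), key=lambda i: codes[i:])


def lcp_array(codes, sa):
    """Kasai's algorithm: lcp[r] = common-prefix length of suffixes sa[r-1] and sa[r]."""
    n = len(codes)
    rank = [0] * n
    for r, start in enumerate(sa):
        rank[start] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] == 0:
            h = 0  # the smallest suffix has no predecessor
            continue
        j = sa[rank[i] - 1]  # suffix just before suffix i in sorted order
        while i + h < n and j + h < n and codes[i + h] == codes[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1  # the next suffix shares at least h - 1 characters
    return lcp


def lcs_via_suffix_array(s: str, t: str) -> str:
    """Longest common substring of s and t using a suffix array plus LCP array."""
    # Concatenate with a sentinel that occurs once and is smaller than any character.
    codes = [ord(c) for c in s] + [-1] + [ord(c) for c in t]
    sa = suffix_array(codes)
    lcp = lcp_array(codes, sa)
    best_len, best_start = 0, 0
    for r in range(1, len(sa)):
        a, b = sa[r - 1], sa[r]
        # Keep only neighbours where exactly one suffix starts inside s.
        if (a < len(s)) != (b < len(s)) and lcp[r] > best_len:
            best_len, best_start = lcp[r], min(a, b)  # min(a, b) lies inside s
    return s[best_start:best_start + best_len]
```

For example, `lcs_via_suffix_array("xabxac", "abcabxabcd")` returns `"abxa"`. Because the sentinel appears only once, a common prefix of two different suffixes can never extend across it. Swapping `suffix_array` for one of the linear-time constructions the abstract mentions would make the whole pipeline linear.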
edited Mar 24 at 17:24
answered Mar 24 at 0:20 by Apass.Jack
Yes. There's even a Wikipedia article about it! https://en.wikipedia.org/wiki/Longest_common_substring_problem
In particular, as Wikipedia explains, there is a linear-time algorithm, using suffix trees (or suffix arrays).
Searching on "longest common substring" turns up that Wikipedia article as the first hit (for me). In the future, please research the problem before asking here. (See, e.g., https://meta.stackoverflow.com/q/261592/781723.)
answered Mar 24 at 0:01 by D.W.