# The Dyck Language Edit Distance Problem in Near-Linear Time

@article{Saha2014TheDL, title={The Dyck Language Edit Distance Problem in Near-Linear Time}, author={B. Saha}, journal={2014 IEEE 55th Annual Symposium on Foundations of Computer Science}, year={2014}, pages={611-620} }

Given a string σ over alphabet Σ and a grammar G defined over the same alphabet, how many minimum number of repairs (insertions, deletions and substitutions) are required to map σ into a valid member of G? The seminal work of Aho and Peterson in 1972 initiated the study of this language edit distance problem providing a dynamic programming algorithm for context free languages that runs in O(|G|2n3) time, where n is the string length and G is the grammar size. While later improvements reduced… Expand

#### Figures and Topics from this paper

#### 26 Citations

Faster Language Edit Distance, Connection to All-pairs Shortest Paths and Related Problems

- Mathematics, Computer Science
- ArXiv
- 2014

It is shown that exact computation of language edit distance in true sub-cubic time will imply a truly sub- cubic algorithm for all-pairs shortest paths, a long-standing open question and result in a breakthrough for a large range of problems in graphs and matrices due to sub-Cubic equivalence. Expand

Language Edit Distance and Maximum Likelihood Parsing of Stochastic Grammars: Faster Algorithms and Connection to Fundamental Graph Problems

- Mathematics, Computer Science
- 2015 IEEE 56th Annual Symposium on Foundations of Computer Science
- 2015

This paper gives the first such algorithm that computes language edit distance almost optimally and designs the very first subcubic (Õ(nω)) algorithm that given an arbitrary stochastic context free grammar, and a string returns a nearly-optimal maximum likelihood parsing of that string. Expand

Language Edit Distance Approximation via Amnesic Dynamic Programming

- 2016

In 1975, a breakthrough result of L. Valiant showed that parsing context free grammars can be reduced to Boolean matrix multiplication, resulting in a running time of O(n) for parsing where ω ≤ 2.373… Expand

Approximating Language Edit Distance Beyond Fast Matrix Multiplication: Ultralinear Grammars Are Where Parsing Becomes Hard!

- Mathematics, Computer Science
- ICALP
- 2017

Additive approximation algorithms for language edit distance are studied, providing two explicit combinatorial algorithms to obtain a string with minimum edit distance with performance dependencies on either the number of non-linear productions, k^*, or theNumber of nested non- linear production, k, used in the optimal derivation. Expand

Sublinear Algorithms for Gap Edit Distance

- Computer Science, Mathematics
- 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS)
- 2019

An algorithm for distinguishing whether the edit distance is at most t or at least t^2 (the quadratic gap problem) in time Õ(n/t+t^3). Expand

Improved bounds for testing Dyck languages

- Computer Science, Mathematics
- SODA
- 2018

This paper introduces a new problem called Truestring Equivalence, which is easily reducible to the $2$-type Dyck language property testing problem, and shows a lower bound of $n$ to the power of $1/5$. Expand

Fast & Space-Efficient Approximations of Language Edit Distance and RNA Folding: An Amnesic Dynamic Programming Approach

- Mathematics, Computer Science
- 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS)
- 2017

These are the first algorithms that break the cubic-time barriers of any combinatorial algorithm, and quadratic-space barrier of any algorithm significantly improving upon their long-standing space and time complexities. Expand

If the Current Clique Algorithms are Optimal, So is Valiant's Parser

- Computer Science, Mathematics
- 2015 IEEE 56th Annual Symposium on Foundations of Computer Science
- 2015

It is proved that any improvement on Valiant' s algorithm, even for constant size grammars, would imply a breakthrough algorithm for the k-Clique problem: given a graph on n nodes, decide if there are k that form a clique. Expand

Approximating Edit Distance within Constant Factor in Truly Sub-Quadratic Time

- Computer Science, Mathematics
- 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS)
- 2018

An algorithm with running time Õ(n^2-2/7) that approximates the edit distance within a constant factor poly(log n) is provided. Expand

C C ] 2 O ct 2 01 9 Sublinear Algorithms for Gap Edit Distance ∗

- 2019

The edit distance is a way of quantifying how similar two strings are to one another by counting the minimum number of character insertions, deletions, and substitutions required to transform one… Expand

#### References

SHOWING 1-10 OF 38 REFERENCES

Testing membership in parenthesis languages

- Mathematics, Computer Science
- Random Struct. Algorithms
- 2003

A sublinear-time algorithm for testing all Dyck languages and a testing algorithm for the context free language LREV, where Σ is a fixed alphabet, which almost matches the lower bound given by Alon et al. Expand

Approximating edit distance efficiently

- Mathematics, Computer Science
- 45th Annual IEEE Symposium on Foundations of Computer Science
- 2004

Algorithms are developed that solve gap versions of the edit distance problem: given two strings of length n with the promise that their edit distance is either at most k or greater than /spl lscr/, decide which of the two holds and develop an n/sup 3/7/-approximation quasilinear time algorithm. Expand

Incremental String Comparison

- Computer Science, Mathematics
- SIAM J. Comput.
- 1998

This paper considers the following incremental version of comparing two sequences A and B to determine their longest common subsequence (LCS) or the edit distance between them, and obtains O(nk) algorithms for the longest prefix approximate match problem, the approximate overlap problem, and cyclic string comparison. Expand

Polylogarithmic Approximation for Edit Distance and the Asymmetric Query Complexity

- Mathematics, Computer Science
- 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
- 2010

The lower bound is the first to expose hardness of edit distance stemming from the input strings being ``repetitive'', which means that many of their substrings are approximately identical, and provides the first rigorous separation between edit distance and Ulam distance. Expand

Approximately Matching Context-Free Languages

- Mathematics, Computer Science
- Inf. Process. Lett.
- 1995

An O(PN2(N + log P)) algorithm for approximately matching a string of length N and a context-free language specified by a grammar of size P is given, which generalizes the Cocke-Younger-Kasami algorithm for determining membership in a context -free language. Expand

Approximating edit distance in near-linear time

- Mathematics, Computer Science
- STOC '09
- 2009

This is the first sub-polynomial approximation algorithm for this problem that runs in near-linear time, improving on the state-of-the-art n<sup>(1/3+o(1)) approximation. Expand

Regular languages are testable with a constant number of queries

- Computer Science
- 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039)
- 1999

This paper discusses testability of more complex languages and shows that the query complexity required for testing context free languages cannot be bounded by any function of /spl epsiv/. Expand

Recognizing well-parenthesized expressions in the streaming model

- Computer Science, Mathematics
- STOC '10
- 2010

It is proved that this one-pass algorithm for Dyck(2) is optimal, up to a log(n) factor, even when two-sided error is allowed, and conjecture that a similar bound holds for any constant number of passes over the input. Expand

An Introduction to Formal Language Theory

- Computer Science
- Texts and Monographs in Computer Science
- 1988

This volume intended to serve as a text for upper undergraduate and graduate level students and special emphasis is given to the role of algebraic techniques in formal language theory through a chapter devoted to the fixed point approach to the analysis of context-free languages. Expand

Streaming Algorithms for Recognizing Nearly Well-Parenthesized Expressions

- Mathematics, Computer Science
- MFCS
- 2011

It is proved that any randomized one-pass algorithm that makes error at most k/n requires at least Ω(k log(n/k))) space to accept strings which are exactly k-away from strings in 1-turn-Dyck2 and to reject strings which is exactly k + 2-awayfrom strings in1- turn-DYck2. Expand