Inference algorithms for pattern-based CRFs on sequence data

Takhanov, Rustem; Kolmogorov, Vladimir

doi:10.1007/s00453-015-0017-7

arXiv:1210.0508v5 (cs)

[Submitted on 1 Oct 2012 (v1), last revised 20 Jan 2017 (this version, v5)]

Abstract: We consider Conditional Random Fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) $x_1 \ldots x_n$ is the sum of terms over intervals $[i,j]$, where each term is non-zero only if the substring $x_i \ldots x_j$ equals a prespecified pattern $\alpha$. Such CRFs can be naturally applied to many sequence tagging problems.
We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) the MAP. Their complexities are respectively $O(n L)$, $O(n L \ell_{\max})$ and $O(n L \min\{|D|,\log (\ell_{\max}+1)\})$, where $L$ is the combined length of input patterns, $\ell_{\max}$ is the maximum length of a pattern, and $D$ is the input alphabet. This improves on the previous algorithms of Ye et al. (2009), whose complexities are respectively $O(n L |D|)$, $O(n |\Gamma| L^2 \ell_{\max}^2)$ and $O(n L |D|)$, where $|\Gamma|$ is the number of input patterns.
In addition, we give an efficient algorithm for sampling. Finally, we consider the case of non-positive weights, for which Komodakis & Paragios (2009) gave an $O(n L)$ algorithm for computing the MAP. We present a modification that has the same worst-case complexity but can beat it in the best case.
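
To make the model in the abstract concrete, the following is a minimal sketch of the pattern-based energy together with a brute-force reference for the partition function and the MAP. It is not one of the paper's algorithms: it enumerates all $|D|^n$ labelings and is only usable for tiny inputs. Several things here are assumptions made for illustration: each pattern carries a single position-independent weight, the distribution is taken as $p(x) \propto \exp(-E(x))$ so that the MAP is the minimum-energy labeling, alphabet symbols are single characters, and the names `energy` and `brute_force_inference` are hypothetical.

```python
import itertools
import math


def energy(x, patterns):
    """Sum the weight of every occurrence of every pattern in the string x.

    `patterns` maps a pattern string alpha to its weight (assumed here to be
    the same at every position). Overlapping occurrences each count once.
    """
    total = 0.0
    for alpha, weight in patterns.items():
        k = len(alpha)
        for i in range(len(x) - k + 1):
            if x[i:i + k] == alpha:  # substring x_i...x_j equals the pattern
                total += weight
    return total


def brute_force_inference(n, alphabet, patterns):
    """Enumerate all labelings of length n over `alphabet`.

    Returns (Z, map_labeling, map_energy), assuming p(x) ~ exp(-E(x)).
    Exponential in n; intended only as a correctness reference.
    """
    z = 0.0
    best_x, best_e = None, math.inf
    for labels in itertools.product(alphabet, repeat=n):
        x = "".join(labels)
        e = energy(x, patterns)
        z += math.exp(-e)
        if e < best_e:
            best_x, best_e = x, e
    return z, best_x, best_e


# Toy instance: alphabet D = {a, b}, two patterns with made-up weights.
toy_patterns = {"ab": -1.0, "bb": 2.0}
z, x_map, e_map = brute_force_inference(2, "ab", toy_patterns)
```

A reference of this kind is mainly useful for checking a faster implementation on short strings, since its cost grows as $|D|^n$ while the complexities quoted in the abstract are linear in $n$.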

Comments:	Algorithmica accepted version
Subjects:	Machine Learning (cs.LG); Data Structures and Algorithms (cs.DS)
Cite as:	arXiv:1210.0508 [cs.LG]
	(or arXiv:1210.0508v5 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1210.0508
Journal reference:	Algorithmica, September 2016, Volume 76, Issue 1, pp 17-46
Related DOI:	https://doi.org/10.1007/s00453-015-0017-7

Submission history

From: Vladimir Kolmogorov
[v1] Mon, 1 Oct 2012 19:13:59 UTC (63 KB)
[v2] Tue, 23 Oct 2012 16:16:58 UTC (64 KB)
[v3] Thu, 8 Nov 2012 00:11:30 UTC (65 KB)
[v4] Sat, 29 Dec 2012 22:13:01 UTC (72 KB)
[v5] Fri, 20 Jan 2017 08:00:44 UTC (60 KB)
