Estimation of Stochastic Attribute-Value Grammars using an Informative Sample

Osborne, Miles

arXiv:cs/0008024 (cs)

[Submitted on 23 Aug 2000]

Abstract: We argue that some of the computational complexity associated with estimation of stochastic attribute-value grammars can be reduced by training upon an informative subset of the full training set. Results using the parsed Wall Street Journal corpus show that, in some circumstances, it is possible to obtain better estimation results using an informative sample than when training upon all the available material. Further experimentation demonstrates that with unlexicalised models, a Gaussian prior can reduce overfitting. However, when models are lexicalised and contain overlapping features, overfitting does not seem to be a problem, and a Gaussian prior makes minimal difference to performance. Our approach is applicable to situations in which there is an infeasibly large number of parses in the training set, or in which recovery of these parses from a packed representation is itself computationally expensive.

Comments:	6 pages, 2 figures. Coling 2000, Saarbrücken, Germany. pp 586--592
Subjects:	Computation and Language (cs.CL)
ACM classes:	I.2.6
Cite as:	arXiv:cs/0008024 [cs.CL]
	(or arXiv:cs/0008024v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.cs/0008024
Journal reference:	Coling 2000, Saarbrücken, Germany. pp 586--592

Submission history

From: Miles Osborne
[v1] Wed, 23 Aug 2000 12:38:08 UTC (23 KB)
