Diffstat (limited to 'contrib/xml/TODO')
-rw-r--r-- contrib/xml/TODO | 78
1 files changed, 78 insertions, 0 deletions
diff --git a/contrib/xml/TODO b/contrib/xml/TODO
new file mode 100644
index 0000000000..5ddd62a658
--- /dev/null
+++ b/contrib/xml/TODO
@@ -0,0 +1,78 @@
+PGXML TODO List
+===============
+
+Some of these items still require much more thought! Since the first
+release, the XPath support has improved (because I'm no longer using a
+homemade algorithm!).
+
+1. Performance considerations
+
+At present each document is parsed to produce the DOM tree on every query.
+
+Pros:
+ Easy
+ No persistent memory or storage allocation for parsed trees
+ (libxml docs suggest representation of a document might
+ be 4 times the size of the text)
+
+Cons:
+ Slow/ CPU intensive to parse.
+ Makes it difficult for PLs to apply libxml manipulations to create
+ new documents or amend existing ones.
+
+
+2. XQuery
+
+I'm not sure if the addition of XQuery would be best as a function or
+as a new front-end parser. This is one to think about, but with a
+decent implementation of XPath, one of the prerequisites is covered.
+
+3. DOM Interfaces
+
+Expose more aspects of the DOM to user functions/ PLs. This would
+allow a procedure in a PL to run some queries and then use exposed
+interfaces to libxml to create an XML document out of the query
+results. I accept the argument that this might be more properly
+performed on the client side.
+
+4. Returning sets of documents from XPath queries.
+
+Although the current implementation allows you to amalgamate the
+returned results into a single document, it's quite possible that
+you'd like to use the returned set of nodes as a source for FROM.
+
+Is there a good way to optimise/index the results of certain XPath
+operations to make them faster?:
+
+select docid, pgxml_xpath(document,'//site/location/text()','','') as location
+where pgxml_xpath(document,'//site/name/text()','','') = 'Church Farm';
+
+and with multiple element occurrences in a document?
+
+select d.docid, pgxml_xpath(d.document,'//site/location/text()','','')
+from docstore d,
+pgxml_xpaths('docstore','document','//feature/type/text()','docid') ft
+where ft.key = d.docid and ft.value ='Limekiln';
+
+pgxml_xpaths params are relname, attrname, xpath, returnkey. It would
+return a set of two-element tuples (key,value) consisting of the value of
+returnkey, and the cdata value of the xpath. The XML document would be
+defined by relname and attrname.
+
+The pgxml_xpaths function could be the basis of a functional index,
+which could speed up the above query very substantially, working
+through the normal query planner mechanism.
+
+5. Return type support.
+
+Better support for returning e.g. numeric or boolean values. I need to
+get to grips with the returned data from libxml first.
+
+
+John Gray <[email protected]> 16 August 2001
+
+
+
+
+
+
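Note on item 4: the (key, value) result described for the proposed pgxml_xpaths function can be sketched outside the database. What follows is only an illustration of those semantics, not the contrib module's code. It uses Python's standard xml.etree.ElementTree rather than libxml, and the function name xpaths_sketch, the sample rows, and their contents are all invented for the example. ElementTree has no text() step, so the path selects the element and the sketch takes its text.

```python
import xml.etree.ElementTree as ET


def xpaths_sketch(rows, path):
    """Yield one (key, value) pair per node matched in each document.

    rows: iterable of (key, xml_text) pairs, standing in for the
          returnkey and attrname columns read from relname.
    path: an ElementTree path such as './/feature/type'.
    """
    for key, xml_text in rows:
        # Each document is parsed on every call, as described in item 1.
        root = ET.fromstring(xml_text)
        for node in root.findall(path):
            yield key, (node.text or "")


# Invented sample data; not taken from any real docstore table.
docstore = [
    (1, "<site><name>Church Farm</name><location>A</location>"
        "<feature><type>Limekiln</type></feature>"
        "<feature><type>Barn</type></feature></site>"),
    (2, "<site><name>Mill Lane</name><location>B</location>"
        "<feature><type>Barn</type></feature></site>"),
]

# A document with several matching elements contributes several pairs.
pairs = list(xpaths_sketch(docstore, ".//feature/type"))

# Equivalent of "where ft.value = 'Limekiln'" in the second query.
limekiln_keys = [key for key, value in pairs if value == "Limekiln"]
```

A document with two feature elements yields two pairs sharing the same key, which is what lets the result be joined back to docstore on docid in the second query.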