r48938 MediaWiki - Code Review archive

Repository:MediaWiki
Revision:r48938
Date:22:01, 27 March 2009
Author:demon
Status:resolved (Comments)
Tags:
Comment:
(bug 17374) Special:Export no longer exports two copies of the same page
Modified paths:
  • /trunk/phase3/RELEASE-NOTES (modified)
  • /trunk/phase3/includes/specials/SpecialExport.php (modified)

Diff

Index: trunk/phase3/includes/specials/SpecialExport.php
@@ -210,7 +210,13 @@
  */
 
 $pages = array_keys( $pageSet );
-
+
+// Normalize titles to the same format and remove dupes, see bug 17374
+foreach( $pages as $k => $v ) {
+	$pages[$k] = str_replace( " ", "_", $v );
+}
+$pages = array_unique( $pages );
+
 /* Ok, let's get to it... */
 if( $history == WikiExporter::CURRENT ) {
 	$lb = false;
Index: trunk/phase3/RELEASE-NOTES
@@ -294,6 +294,7 @@
 * (bug 18031) Make namespace selector on Special:Export remember the previous
   selection
 * The svn-version version numbers on Special:Version have been removed
+* (bug 17374) Special:Export no longer exports two copies of the same page
 
 == API changes in 1.15 ==
 * (bug 16858) Revamped list=deletedrevs to make listing deleted contributions

Follow-up revisions

Revision | Commit summary | Author | Date
r53521 | * (bug 17374) Special:Export no longer exports multiple copies of pages... | brion | 02:47, 20 July 2009

Comments

Comment by Tim Starling, 10:18, 4 May 2009

This should not be necessary; it indicates DB corruption.

Comment by Tim Starling, 10:56, 4 May 2009

Actually there's no DB corruption, just sloppy code. Your normalisation is inferior to the one in Title.php, and there are plenty of ways to export duplicate titles by exploiting the differences between the two algorithms. The user input titles need to be properly sanitized using Title::newFromText(). SpecialExport::getLinks() adds the input titles to the array; this is incorrect and should be done by the caller instead, if there are indeed any cases where it is necessary. Unnecessary validation of titles from the addns and addcat features could be avoided by reordering the input phase: sanitize the pages parameter and construct the array earlier.
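
To illustrate the point about the two algorithms differing, here is a minimal standalone sketch. It does not call MediaWiki: sketchNormalizeTitle() is a hypothetical stand-in that mimics only a few steps Title.php is known to apply (collapsing runs of spaces and underscores, trimming them, and capitalising the first letter, the last assuming a wiki with first-letter capitalisation enabled). r48938Normalize() is likewise a hypothetical wrapper around the str_replace() call from the diff above. Real code should rely on Title::newFromText() rather than anything like this stand-in.

```php
<?php
// Hypothetical stand-in for a few of the steps Title.php performs.
// This is NOT MediaWiki's algorithm; it only illustrates the gap.
function sketchNormalizeTitle( string $text ): string {
	// Treat spaces and underscores alike, collapsing runs into one underscore
	$key = preg_replace( '/[ _]+/', '_', $text );
	// Strip leading and trailing underscores
	$key = trim( $key, '_' );
	// Uppercase the first letter (assumes first-letter capitalisation is on)
	return ucfirst( $key );
}

// The normalisation committed in r48938, wrapped in a function for comparison
function r48938Normalize( string $text ): string {
	return str_replace( ' ', '_', $text );
}

// Four spellings a user could submit for what is one page on such a wiki
$input = [ 'Main Page', 'Main_Page', 'main Page', ' Main  Page ' ];

$patched = array_unique( array_map( 'r48938Normalize', $input ) );
$stricter = array_unique( array_map( 'sketchNormalizeTitle', $input ) );

echo count( $patched ), "\n";  // prints 3: Main_Page, main_Page, _Main__Page_
echo count( $stricter ), "\n"; // prints 1: Main_Page
```

With only the str_replace() pass, three of the four spellings survive array_unique() as distinct strings, so the same page could still be exported more than once.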
