Jump to content

API:Import

From mediawiki.org
MediaWiki version:
1.15

POST request to import a page from another wiki (transwikiing) or from an xml file.

API documentation

[edit]

action=import

(main | import)
  • This module requires read rights.
  • This module requires write rights.
  • This module only accepts POST requests.
  • Source: MediaWiki
  • License: GPL-2.0-or-later

Import a page from another wiki, or from an XML file.

Note that the HTTP POST must be done as a file upload (i.e. using multipart/form-data) when sending a file for the xml parameter.

Specific parameters:
Other general parameters are available.
summary

Log entry import summary.

xml

Uploaded XML file.

Must be posted as a file upload using multipart/form-data.
interwikiprefix

For uploaded imports: interwiki prefix to apply to unknown usernames (and known users if assignknownusers is set).

interwikisource

For interwiki imports: wiki to import from.

One of the following values: meta, usability, w:en, wikitech
interwikipage

For interwiki imports: page to import.

fullhistory

For interwiki imports: import the full history, not just the current version.

Type: boolean (details)
templates

For interwiki imports: import all included templates as well.

Type: boolean (details)
namespace

Import to this namespace. Cannot be used together with rootpage.

One of the following values: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 90, 91, 92, 93, 100, 101, 102, 103, 104, 105, 106, 107, 486, 487, 710, 711, 828, 829, 1198, 1199, 2600, 5500, 5501
assignknownusers

Assign edits to local users where the named user exists locally.

Type: boolean (details)
rootpage

Import as subpage of this page. Cannot be used together with namespace.

tags

Change tags to apply to the entry in the import log and to the null revision on the imported pages.

Values (separate with | or alternative): AWB, convenient-discussions
token

A "csrf" token retrieved from action=query&meta=tokens

This parameter is required.


Import process

[edit]

Importing a page is a multi-step process:

  1. Log in using one of the methods described in API:Login .
  2. GET a CSRF token . This token is the same for all pages but changes at every login.
  3. Send a POST request with the CSRF token in order to import the page.

The code samples below cover the third step in detail.

Example 1: Import a page from another wiki

[edit]

POST request

[edit]
Import Help:Extension:ParserFunctions to the Manual namespace (namespace 100) with full history.

Response

[edit]
{
  "import": [
    {
      "ns": 12, 
      "revisions": 639, 
      "title": "Help:ParserFunctions"
    }
  ]
}

Sample code 1

[edit]

Python

[edit]
#!/usr/bin/python3

"""
    import_interwiki.py

    MediaWiki Action API Code Samples
    Demo of `Import` module: Import a page from another wiki by
    specifying its title
    MIT license
"""

import requests

S = requests.Session()

URL = "https://test.wikipedia.org/w/api.php"

# Step 1: Retrieve a login token
PARAMS_1 = {
    "action": "query",
    "meta": "tokens",
    "type": "login",
    "format": "json"
}

R = S.get(url=URL, params=PARAMS_1)
DATA = R.json()

LOGIN_TOKEN = DATA['query']['tokens']['logintoken']

# Step 2: Send a post request to log in using the clientlogin method.
# import rights can't be granted using Special:BotPasswords
# hence using bot passwords may not work.
# See https://www.mediawiki.org/wiki/API:Login for more
# information on log in methods.
PARAMS_2 = {
    "action":"clientlogin",
    "username":"username",
    "password":"password",
    'loginreturnurl': 'http://127.0.0.1:5000/',
    "format":"json",
    "logintoken":LOGIN_TOKEN
}

R = S.post(URL, data=PARAMS_2)

# Step 3: While logged in, retrieve a CSRF token
PARAMS_3 = {
    "action": "query",
    "meta": "tokens",
    "format": "json"
}

R = S.get(url=URL, params=PARAMS_3)
DATA = R.json()

CSRF_TOKEN = DATA['query']['tokens']['csrftoken']

# Step 4: Post request to import page from another wiki
PARAMS_4 = {
    "action": "import",
    "format": "json",
    "interwikisource": "meta",
    "interwikipage": "Help:ParserFunctions",
    "fullhistory":"true",
    "namespace":"100",
    "token": CSRF_TOKEN
}

R = S.post(url=URL, data=PARAMS_4)
DATA = R.json()

print(DATA)

PHP

[edit]
<?php

/*
    import_interwiki.php

    MediaWiki API Demos
    Demo of `Import` module: Import a page from another wiki by
	specifying its title

    MIT license
*/

$endPoint = "http://dev.wiki.local.wmftest.net:8080/w/api.php";

$login_Token = getLoginToken(); // Step 1
loginRequest( $login_Token ); // Step 2
$csrf_Token = getCSRFToken(); // Step 3
import( $csrf_Token ); // Step 4

// Step 1: GET request to fetch login token
function getLoginToken() {
	global $endPoint;

	$params1 = [
		"action" => "query",
		"meta" => "tokens",
		"type" => "login",
		"format" => "json"
	];

	$url = $endPoint . "?" . http_build_query( $params1 );

	$ch = curl_init( $url );
	curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
	curl_setopt( $ch, CURLOPT_COOKIEJAR, "cookie.txt" );
	curl_setopt( $ch, CURLOPT_COOKIEFILE, "cookie.txt" );

	$output = curl_exec( $ch );
	curl_close( $ch );

	$result = json_decode( $output, true );
	return $result["query"]["tokens"]["logintoken"];
}

// Step 2: POST request to log in. Use of main account for login is not
// supported. Obtain credentials via Special:BotPasswords
// (https://www.mediawiki.org/wiki/Special:BotPasswords) for lgname & lgpassword
function loginRequest( $logintoken ) {
	global $endPoint;

	$params2 = [
		"action" => "clientlogin",
		"username" => "username",
		"password" => "password",
		'loginreturnurl' => 'http://127.0.0.1:5000/',
		"logintoken" => $logintoken,
		"format" => "json"
	];

	$ch = curl_init();

	curl_setopt( $ch, CURLOPT_URL, $endPoint );
	curl_setopt( $ch, CURLOPT_POST, true );
	curl_setopt( $ch, CURLOPT_POSTFIELDS, http_build_query( $params2 ) );
	curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
	curl_setopt( $ch, CURLOPT_COOKIEJAR, "cookie.txt" );
	curl_setopt( $ch, CURLOPT_COOKIEFILE, "cookie.txt" );

	$output = curl_exec( $ch );
	curl_close( $ch );
}

// Step 3: GET request to fetch CSRF token
function getCSRFToken() {
	global $endPoint;

	$params3 = [
		"action" => "query",
		"meta" => "tokens",
		"format" => "json"
	];

	$url = $endPoint . "?" . http_build_query( $params3 );

	$ch = curl_init( $url );

	curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
	curl_setopt( $ch, CURLOPT_COOKIEJAR, "cookie.txt" );
	curl_setopt( $ch, CURLOPT_COOKIEFILE, "cookie.txt" );

	$output = curl_exec( $ch );
	curl_close( $ch );

	$result = json_decode( $output, true );
	return $result["query"]["tokens"]["csrftoken"];
}

// Step 4: POST request to import page from another wiki
function import( $csrftoken ) {
	global $endPoint;

	$params4 = [
		"action" => "import",
		"interwikisource" => "wikipedia:en",
		"interwikipage" => "Pragyan (rover)",
		"namespace" => "0",
		"fullhistory" => "true",
		"token" => $csrftoken,
		"format" => "json"
	];

	$ch = curl_init();

	curl_setopt( $ch, CURLOPT_URL, $endPoint );
	curl_setopt( $ch, CURLOPT_POST, true );
	curl_setopt( $ch, CURLOPT_POSTFIELDS, http_build_query( $params4 ) );
	curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
	curl_setopt( $ch, CURLOPT_COOKIEJAR, "cookie.txt" );
	curl_setopt( $ch, CURLOPT_COOKIEFILE, "cookie.txt" );

	$output = curl_exec( $ch );
	curl_close( $ch );

	echo ( $output );
}

JavaScript

[edit]
/*  
    import_interwiki.js
 
    MediaWiki API Demos
    Demo of `Import` module: Import a page from another wiki by
	specifying its title

    MIT license
*/

var request = require('request').defaults({jar: true}),
    url = "http://dev.wiki.local.wmftest.net:8080/w/api.php";

// Step 1: GET request to fetch login token
function getLoginToken() {
    var params_0 = {
        action: "query",
        meta: "tokens",
        type: "login",
        format: "json"
    };

    request.get({ url: url, qs: params_0 }, function (error, res, body) {
        if (error) {
            return;
        }
        var data = JSON.parse(body);
        loginRequest(data.query.tokens.logintoken);
    });
}

// Step 2: POST request to log in. 
// Use of main account for login is not
// supported. Obtain credentials via Special:BotPasswords
// (https://www.mediawiki.org/wiki/Special:BotPasswords) for lgname & lgpassword
function loginRequest(login_token) {
    var params_1 = {
        action: "clientlogin",
        username: "username",
        password: "password",
        loginreturnurl: "http://127.0.0.1:5000/",
        logintoken: login_token,
        format: "json"
    };

    request.post({ url: url, form: params_1 }, function (error, res, body) {
        if (error) {
            return;
        }
        getCsrfToken();
    });
}

// Step 3: GET request to fetch CSRF token
function getCsrfToken() {
    var params_2 = {
        action: "query",
        meta: "tokens",
        format: "json"
    };

    request.get({ url: url, qs: params_2 }, function(error, res, body) {
        if (error) {
            return;
        }
        var data = JSON.parse(body);
        import_interwiki(data.query.tokens.csrftoken);
    });
}

// Step 4: POST request to import page from another wiki
function import_interwiki(csrf_token) {
    var params_3 = {
        action: "import",
        interwikisource: "wikipedia:en",
        interwikipage: "Pragyan (rover)",
        namespace: "0",
        fullhistory: "true",
        token: csrf_token,
        format: "json"
    };

    request.post({ url: url, form: params_3 }, function (error, res, body) {
        if (error) {
            return;
        }
        console.log(body);
    });
}

// Start From Step 1
getLoginToken();

MediaWiki JS

[edit]
/*
	import_interwiki.js

	MediaWiki API Demos
	Demo of `Import` module: Import a page from another wiki by
    specifying its title

	MIT License
*/

var params = {
		action: 'import',
		interwikisource: 'en:w',
		interwikipage: 'Template:!-',
		fullhistory: 'true',
		namespace: '100',
		format: 'json'
	},
	api = new mw.Api();

api.postWithToken( 'csrf', params ).done( function ( data ) {
	console.log( data );
} );

Example 2: Import a page by uploading its xml dump

[edit]

POST request

[edit]

Import Help:Extension:ParserFunctions by uploading its xml dump obtained from Special:Export.

When uploading a file, you need to use multipart/form-data as Content-Type or enctype, application/x-www-form-urlencoded will not work.

The parameter xml is not a file name, but the actual content of a file.

Response

[edit]
Response
{
  "import": [
    {
      "ns": 12, 
      "title": "Help:ParserFunctions",
      "revisions": 639
    }
  ]
}

Sample Code 2

[edit]

Python

[edit]
#!/usr/bin/python3

"""
    import_xml.py

    MediaWiki Action API Code Samples
    Demo of `Import` module: Import a page from another wiki
    by uploading its xml dump
    MIT license
"""

import requests

S = requests.Session()

URL = "https://test.wikipedia.org/w/api.php"
FILE_PATH = '/path/to/your/file.xml'

# Step 1: Retrieve a login token
PARAMS_1 = {
    "action": "query",
    "meta": "tokens",
    "type": "login",
    "format": "json"
}

R = S.get(url=URL, params=PARAMS_1)
DATA = R.json()

LOGIN_TOKEN = DATA['query']['tokens']['logintoken']

# Step 2: Send a post request to log in using the clientlogin method.
# importupload rights can't be granted using Special:BotPasswords
# hence using bot passwords may not work.
# See https://www.mediawiki.org/wiki/API:Login for more
# information on log in methods.
PARAMS_2 = {
    "action":"clientlogin",
    "username":"username",
    "password":"password",
    'loginreturnurl': 'http://127.0.0.1:5000/',
    "format":"json",
    "logintoken":LOGIN_TOKEN
}

R = S.post(URL, data=PARAMS_2)

# Step 3: While logged in, retrieve a CSRF token
PARAMS_3 = {
    "action": "query",
    "meta": "tokens",
    "format": "json"
}

R = S.get(url=URL, params=PARAMS_3)
DATA = R.json()

CSRF_TOKEN = DATA['query']['tokens']['csrftoken']

# Step 4: Post request to upload xml dump.
# xml dumps can be downloaded through Special:Export
# See https://www.mediawiki.org/wiki/Special:Export
PARAMS_4 = {
    "action": "import",
    "format": "json",
    "token": CSRF_TOKEN,
    "interwikiprefix": "meta"
}

FILE = {'xml':('file.xml', open(FILE_PATH))}

R = S.post(url=URL, files=FILE, data=PARAMS_4)
DATA = R.json()

print(DATA)

JavaScript

[edit]
/*  
    import_xml.js
 
    MediaWiki API Demos
    Demo of `Import` module: Import a page from another wiki
    by uploading its xml dump

    MIT license
*/

var fs = require('fs'),
    request = require('request').defaults({jar: true}),
    url = "http://dev.wiki.local.wmftest.net:8080/w/api.php";

// Step 1: GET request to fetch login token
function getLoginToken() {
    var params_0 = {
        action: "query",
        meta: "tokens",
        type: "login",
        format: "json"
    };

    request.get({ url: url, qs: params_0 }, function (error, res, body) {
        if (error) {
            return;
        }
        var data = JSON.parse(body);
        loginRequest(data.query.tokens.logintoken);
    });
}

// Step 2: POST request to log in. 
// Use of main account for login is not
// supported. Obtain credentials via Special:BotPasswords
// (https://www.mediawiki.org/wiki/Special:BotPasswords) for lgname & lgpassword
function loginRequest(login_token) {
    var params_1 = {
        action: "clientlogin",
        username: "username",
        password: "password",
        loginreturnurl: "http://127.0.0.1:5000/",
        logintoken: login_token,
        format: "json"
    };

    request.post({ url: url, form: params_1 }, function (error, res, body) {
        if (error) {
            return;
        }
        getCsrfToken();
    });
}

// Step 3: GET request to fetch CSRF token
function getCsrfToken() {
    var params_2 = {
        action: "query",
        meta: "tokens",
        format: "json"
    };

    request.get({ url: url, qs: params_2 }, function(error, res, body) {
        if (error) {
            return;
        }
        var data = JSON.parse(body);
        import_xml(data.query.tokens.csrftoken);
    });
}

// Step 4: POST request to upload xml dump.
// xml dumps can be downloaded through Special:Export
// See https://www.mediawiki.org/wiki/Special:Export
function import_xml(csrf_token) {
    var params_3 = {
        action: "import",
        interwikiprefix: "en",
        token: csrf_token,
        format: "json"
    };

    var file = {
        xml: fs.createReadStream('a.xml')
    };

    var formData = Object.assign( {}, params_3, file );

    request.post({ url: url, formData: formData }, function (error, res, body) {
        if (error) {
            return;
        }
        console.log(body);
    });
}

// Start From Step 1
getLoginToken();

Possible errors

[edit]

In addition to the standard error messages :

Code Info
notoken The token parameter must be set.
cantimport You don't have permission to import pages.
cantimport-upload You don't have permission to import uploaded pages.
nointerwikipage The interwikipage parameter must be set.
nofile You didn't upload a file
filetoobig The file you uploaded is bigger than the maximum upload size
partialupload The file was only partially uploaded
notempdir The temporary upload directory is missing
This generally means the server is broken or misconfigured
cantopenfile Couldn't open the uploaded file
This generally means the server is broken or misconfigured
badinterwiki Invalid interwiki title specified
import-unknownerror Unknown error on import: error.

Parameter history

[edit]
  • v1.29: Introduced tags
  • v1.20: Introduced rootpage

Additional notes

[edit]
  • This module cannot be used as a generator .
  • importupload rights are required in order to upload an xml file, while import rights are required for interwiki imports.
  • If you get a Missing boundary in multipart/form-data POST data error, it is because you sent it url-encoded but claimed it would be multipart/form-data. MediaWiki is looking for a boundary in the header but cannot find it.
  • Parameters marked with upload are only used when importing an uploaded XML file. Similarly, parameters marked with interwiki are only used when importing from another wiki (transwiki).
  • The possible values for the interwikisource parameter differ per wiki, see Manual:$wgImportSources . If the list of possible values for this parameter is empty, interwiki imports are disabled.

See also

[edit]