Now that we're losing access to Yahoo's index, we need to migrate the existing copyvio tools to a different API.
Description
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | None | T116957 Plagiarism detection tools for text (tracking)
Resolved | | kaldari | T131169 Help CorenBot migrate to a new API
Resolved | | kaldari | T125459 Investigation: Can we find a new search API for CorenSearchBot and Copyvio Detector tool?
Resolved | | bd808 | T132943 Create an API request proxy so that Tool Labs tools can access the Yandex API from a single IP address
Resolved | | bd808 | T132950 Create project Yandex-proxy
Resolved | | yuvipanda | T132982 Static IP for yandex-proxy01.yandex-proxy.eqiad.wmflabs
Event Timeline
The new API authenticates requests with an HTTP Basic Authorization header sent with each API request.
In PHP this would be implemented something like:
$accountKey = 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA';
$webSearchURL = 'https://api.datamarket.azure.com/Bing/Search/Web?$format=json&Query=';

// The account key serves as both the username and the password.
$context = stream_context_create(array(
    'http' => array(
        'request_fulluri' => true,
        'header' => "Authorization: Basic " . base64_encode($accountKey . ":" . $accountKey)
    )
));

// The API expects the query text wrapped in single quotes.
$request = $webSearchURL . urlencode('\'' . $_POST["searchText"] . '\'');
$response = file_get_contents($request, false, $context);
$jsonobj = json_decode($response);
@coren: Let me know what I can do to help. Do you have the code in a version control system?
The new Google Search API is now available for use from Tool Labs.
API documentation:
https://developers.google.com/custom-search/json-api/v1/using_rest#making_a_request
Proxy documentation:
https://wikitech.wikimedia.org/wiki/Nova_Resource:Google-api-proxy
Ping me to get the key and cx value.
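For anyone porting a tool, here is a rough PHP sketch of how a request using the key and cx value might be assembled and how result URLs might be pulled out of the response. It assumes the public Custom Search JSON API conventions (`key`, `cx`, and `q` query parameters, and an `items` list whose entries carry a `link` field). The function names `buildSearchUrl` and `extractResultLinks` are made up for illustration, and the proxy's base URL is deliberately not shown: take it from the proxy documentation and pass it as `$endpoint`.

```php
<?php
// Sketch only, not the code used by any existing tool.
// Default endpoint is Google's public one; when going through the Tool Labs
// proxy, pass the proxy's base URL (see the proxy documentation) instead.
const GOOGLE_CSE_ENDPOINT = 'https://www.googleapis.com/customsearch/v1';

/**
 * Build a Custom Search request URL (hypothetical helper).
 * $key and $cx are the credentials handed out on request.
 */
function buildSearchUrl($query, $key, $cx, $endpoint = GOOGLE_CSE_ENDPOINT)
{
    $params = array(
        'key' => $key,
        'cx'  => $cx,
        'q'   => $query,
    );
    // RFC 3986 encoding turns spaces into %20 rather than '+'.
    return $endpoint . '?' . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

/**
 * Extract result URLs from a JSON response body (hypothetical helper).
 * Returns an empty array if the body is not valid JSON or has no results.
 */
function extractResultLinks($json)
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['items']) || !is_array($data['items'])) {
        return array();
    }
    $links = array();
    foreach ($data['items'] as $item) {
        if (isset($item['link'])) {
            $links[] = $item['link'];
        }
    }
    return $links;
}
```

A call would then look roughly like `extractResultLinks(file_get_contents(buildSearchUrl($text, $key, $cx, $proxyBase)))`, where `$proxyBase` is a placeholder for whatever base URL the proxy documentation specifies.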