BDTD file importer #2852

felipeaf · 2022-07-16T05:22:41Z

Import search results from Biblioteca digital brasileira de teses e dissertações (BDTD; Brazilian digital library of theses and dissertations; https://bdtd.ibict.br), exported as JSON.

AbeJellinek · 2022-07-26T15:29:46Z

Thanks, a couple things:

I don't think this should be an import translator at all. Just keep the web part and make the JSON translation routine internal. We don't normally have import translators for specific sites' formats without good justification.
The file is huge! Almost 31,000 lines and over a megabyte in size? Please trim down the test cases - there's no way we need that much.

AbeJellinek · 2022-07-26T20:40:31Z

OK, looks like the huge file size is coming from the second test. A few issues contributing:

We need to call selectItems() on search results pages. The translator shouldn't be importing every single search result without giving the user a chance to choose.
It should allow you to select from among the search results visible on the current page. So if I'm on a page with 10 search results displayed, I should get ten options in the selectItems() dialog. Right now it automatically imports every single search result on every single page of the search (hence the gigantic file size).
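The behavior requested above can be sketched roughly as follows. This is a minimal illustration, not the translator's actual code: `rows` stands in for whatever the translator scrapes from the current page (the `{url, title}` shape is an assumption), `scrape` is a hypothetical per-item handler, and `selectItems` is passed in so the sketch runs outside Zotero. In a real translator that argument would be `Zotero.selectItems`.

```javascript
// Build the { url: title } map that a selectItems() dialog expects,
// using only the results visible on the current page.
// The { url, title } row shape is an assumption made for this sketch.
function getVisibleResults(rows) {
	const items = {};
	let found = false;
	for (const row of rows) {
		if (!row.url || !row.title) continue; // skip incomplete rows
		items[row.url] = row.title.trim();
		found = true;
	}
	return found ? items : false;
}

// Ask the user which results to import, then hand each chosen URL to
// `scrape` (a hypothetical per-item handler). `selectItems` is injected
// for testability; inside Zotero it would be Zotero.selectItems.
function importSelected(rows, selectItems, scrape) {
	const items = getVisibleResults(rows);
	if (!items) return;
	selectItems(items, function (selected) {
		if (!selected) return; // user cancelled the dialog
		for (const url of Object.keys(selected)) {
			scrape(url);
		}
	});
}
```

With ten results on the page, `getVisibleResults` yields ten entries, and only the entries the user ticks are passed on to `scrape`.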

felipeaf · 2022-07-26T21:59:18Z

Hi! Actually, the last commit was a mistake. I hadn't finished the web translator; when I opened the pull request I had the file importer working and a reasonably small test file. I will try to finish the web part soon.
About the import translator, I can change it if it's not relevant to Zotero. But let me give some context about this site: it is a service maintained by the Brazilian government, and it includes theses from several Brazilian universities.

felipeaf · 2022-07-27T03:33:24Z

OK, I reverted the last commit, because pushing it to master was a mistake, so now the head has just a small test case and a file importer that works.

About the web importer: I said before that I would finish it soon, but I didn't know how web importers work, and now I see that the JSON importer I wrote is not useful for importing just one page. This JSON format is one of two formats in which the user can download the full search result (the other is CSV; both look non-standard) by clicking an "export" button. It's a download option for the user, not JSON used by the page itself. To import just one page, it would be better to parse the HTML DOM instead.

But that seems to me a different use case. The user can download and import a full search result in order to do a systematic literature review. As I said, BDTD is maintained by the government, and I guess it has some relevance for Brazilian researchers, because all master's and doctoral theses from a lot of Brazilian universities are there.

AbeJellinek · 2022-07-27T13:48:56Z

This should be a web translator and should support both search results and individual item pages. A site's size isn't an argument for making its JSON schema into an import translator - if that schema isn't used by more than just one site or as an interchange format, there's no point.

I'd imagine there's a way to export a single item as JSON, even if it's not exposed in the frontend - is there not?
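Supporting both page types usually starts with the detection step. The sketch below shows one way that could look. The `/Record/` and `/Search/Results` path patterns are assumptions based on common VuFind URL layouts and have not been checked against BDTD, and `detectPageType` is a hypothetical helper name. `"multiple"` and `"thesis"` are the values a Zotero `detectWeb` would return for a results list and a single thesis.

```javascript
// Decide what kind of page we are on.
// The path patterns are assumptions (typical VuFind layout), not
// confirmed for BDTD. `visibleResultCount` is the number of results
// found on the page by the caller.
function detectPageType(url, visibleResultCount) {
	const path = new URL(url).pathname;
	if (/\/Record\/[^/]+/.test(path)) {
		return "thesis"; // individual item page
	}
	if (/\/Search\/Results/.test(path) && visibleResultCount > 0) {
		return "multiple"; // search results page with something to select
	}
	return false; // not a page this sketch recognizes
}
```

Returning `false` for a results page with no visible results keeps the translator from offering an empty selection dialog.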

felipeaf · 2022-07-27T17:40:09Z

Hi! I've found a way to get that JSON for only the items visible on the page, and I'll try to finish the web translator the right way later, but I have a problem. Zotero already has some web translator that partially works with BDTD. It imports only title, authors and year, and a lot of data is missing (including abstract and tags). I checked, and there is no reference to BDTD URLs in the repository at all, but I guess it works because the BDTD site is an instance of VuFind, software that is widely used for this kind of site. The problem is that I don't know how to check which translator is doing that, or whether I can write a more specific one.

* Actually, that JSON link is an API from VuFind (see https://vufind.org/wiki/development:apis:search#search_api). I don't know if it can be reused with other VuFind-based sites. That JSON is too weird to be a standard, it should be based in some local setting.
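As a rough illustration of getting JSON for only the visible page, the sketch below turns a results-page URL into a VuFind search API URL. It rests on assumptions: that the instance serves the API under `<base>/api/v1/search`, and that the results page carries `lookfor`, `type` and `page` in its query string. Neither is confirmed for BDTD here, and `buildSearchApiUrl` is a hypothetical helper name.

```javascript
// Map a search results page URL to the matching search API URL.
// Assumptions: the API lives at `<apiBase>/api/v1/search`, and the
// results page uses `lookfor`, `type` and `page` query parameters.
function buildSearchApiUrl(resultsPageUrl, apiBase, limit) {
	const page = new URL(resultsPageUrl);
	const api = new URL(apiBase.replace(/\/$/, "") + "/api/v1/search");
	// Carry the user's query over, falling back to defaults when absent.
	api.searchParams.set("lookfor", page.searchParams.get("lookfor") || "");
	api.searchParams.set("type", page.searchParams.get("type") || "AllFields");
	api.searchParams.set("page", page.searchParams.get("page") || "1");
	// Request only as many records as one results page shows.
	api.searchParams.set("limit", String(limit));
	return api.toString();
}
```

Passing the same `page` and a `limit` equal to the page size is what keeps the response aligned with what the user actually sees.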

AbeJellinek · 2022-08-02T17:30:50Z

> That JSON is too weird to be a standard, it should be based in some local setting.

Why? It seems like a standard feature according to that page...

What isn't working well with the current VuFind translator? Can we just add some fixes there instead of writing a new one with a different API?

felipeaf and others added 4 commits July 16, 2022 02:20

BDTD file importer

99d6c2c

Import search results from Biblioteca digital brasileira de teses e dissertações (BDTD; Brazilian digital library of theses and dissertations; https://bdtd.ibict.br)

typo bug in date

2e00dd4

fix test

6856d92

implementing web part of btdt translator

29cc7ce

Revert "implementing web part of btdt translator"

784e17a

This reverts commit 29cc7ce.

Felipe Ferreira added 3 commits July 27, 2022 20:40

Revert "Revert "implementing web part of btdt translator""

01e6d83

This reverts commit 784e17a.

fixing web translator, calling selectItems

91663df

web importer for individual page

5ba67eb

This was referenced Aug 24, 2022

add web translator for Index Theologicus and Relbib #2873

Open

Add translator for VuFind #2874

Draft

adam3smith mentioned this pull request Oct 24, 2023

Add Library Catalog (VuFind).js #2969

Open
