Translator for CONTENTdm archive and library database software #968

Open
wants to merge 1 commit into
base: master

emmareisz

Translator for CONTENTdm archive and library database software, v6 with limited support for v4 and v5.

"target": "/cdm/|/cdm4/",
"minVersion": "3.0.9",
"maxVersion": "",
"priority": 100,
Contributor

This should be 270 as per #765

header = headers[ j ].textContent;
header = camelize( header );
}
if ( contents[ i ].textContent ) content = contents[ i ].textContent;
Contributor

What if it's not? Then content keeps the value it had in the previous loop iteration (as does header, above).
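
One way to avoid the stale values is to reset both variables at the top of every iteration. The following is only a minimal sketch, not the PR's actual code: `collectFields` is a hypothetical helper, the `camelize` shown here is a stand-in for the PR's own function (whose implementation is not shown in this thread), and it assumes the header and content cells are parallel lists indexed together.

```javascript
// Hypothetical stand-in for the PR's camelize() helper (assumption: the real one may differ).
function camelize(text) {
	return text
		.trim()
		.toLowerCase()
		.replace(/[^a-z0-9]+(.)/g, function (match, chr) {
			return chr.toUpperCase();
		});
}

// Hypothetical helper: pairs each header cell with its content cell.
// Assumes headers and contents are parallel, array-like lists of objects with textContent.
function collectFields(headers, contents) {
	var fields = {};
	for (var i = 0; i < contents.length; i++) {
		// Reset on every pass so nothing leaks in from the previous row
		var header = "";
		var content = "";
		if (headers[i] && headers[i].textContent) {
			header = camelize(headers[i].textContent);
		}
		if (contents[i].textContent) {
			content = contents[i].textContent.trim();
		}
		// Only record rows where both a header and a value were found
		if (header && content) {
			fields[header] = content;
		}
	}
	return fields;
}
```

With this shape, a row whose content cell is empty is skipped rather than silently reusing the previous row's value.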

@aurimasv
Contributor

aurimasv commented Nov 3, 2015

Please address the comments above to start, and run the code through http://jsbeautifier.org/ using tabs for indenting. It looks like there will be further comments after that.

"items": [
{
"itemType": "manuscript",
"title": "MS.15.1.2.000. Sir Robert Hart Diary: Volume 02: February 1855-July 1855",
Collaborator

This looks like the archive location is prepended to the title. Could we clean that up?

Author

There are two points here.

1) In this particular catalogue record, 'MS.15.1.2.000' is given as the item title, hence it is scraped by the translator. On this occasion the item title is redundant, but in many cases it isn't.

2) The slightly awkward format arises because CDM 6 has both object-level and item-level data. Libraries implement these inconsistently, and neither can be safely omitted. Because of the way CDM displays records, libraries often put essential information at the object level, so (as in this example) simply dropping the object-level data whenever item data is present would lead to some uninformative scrapes. Hence the translator combines the item information with the object-level data.
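
The combination described above could look roughly like the sketch below. It is illustrative only and not taken from the translator: `combineTitles` and its parameter names are hypothetical, and the ". " separator is inferred from the title in the test case.

```javascript
// Hypothetical helper: joins the item-level and object-level titles,
// keeping whichever parts are present (assumption: ". " is the separator).
function combineTitles(itemTitle, objectTitle) {
	var parts = [itemTitle, objectTitle]
		.map(function (part) {
			return (part || "").trim();
		})
		.filter(function (part) {
			return part.length > 0;
		});
	// If both levels carry the same text, keep it only once
	if (parts.length === 2 && parts[0] === parts[1]) {
		parts.pop();
	}
	return parts.join(". ");
}

// Reproduces the title from the test case under discussion
var title = combineTitles(
	"MS.15.1.2.000",
	"Sir Robert Hart Diary: Volume 02: February 1855-July 1855"
);
```

Keeping both parts means a record with only object-level data, or only item-level data, still yields a usable title.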

Contributor

It looks to me like MS.15.1.2.000 is just a single page and the diary contains several pages, e.g. MS.15.1.2.005 stands for page number 5.

Author

Exactly.

On 3 Nov 2015, at 12:05, Philipp Zumstein wrote:

In CONTENTdm.js:

    +    } else {
    +        scrape( doc, url );
    +    }
    +}
    +
    +/** BEGIN TEST CASES **/
    +var testCases = [
    +    {
    +        "type": "web",
    +        "url": "http://cdm15979.contentdm.oclc.org/cdm/compoundobject/collection/p15979coll3/id/2419",
    +        "items": [
    +            {
    +                "itemType": "manuscript",
    +                "title": "MS.15.1.2.000. Sir Robert Hart Diary: Volume 02: February 1855-July 1855",

    It looks to me like MS.15.1.2.000 is just a single page and the diary contains several pages, e.g. MS.15.1.2.005 stands for page number 5.

Collaborator

OK, great. Leave as is, then; one less thing to fix.

@zuphilip added the "New Translator" label (Pull requests for new translators) on Nov 12, 2017
@adam3smith
Collaborator

@emmareisz I think the needed changes are fairly small -- are you interested in finishing this translator up still?

@emmareisz
Author

Hi @adam3smith - yes, I've been using the CONTENTdm translator and sharing it with colleagues, so I'd like to get it incorporated into Zotero. CONTENTdm's also had an update in the meantime. I should have some time over Christmas to finish it.

I'd like to get the BNA translator finished first, though, if that's ok - could you take a quick look at it and then I'll do a PR?

@adam3smith
Collaborator

adam3smith commented Nov 22, 2017

Yes of course -- I left a comment there.

@AbeJellinek
Member

@emmareisz, any progress on this or the BNA translator? I would love to get all this merged!

@emmareisz
Author

Just pottering along using them privately! I'll email you @AbeJellinek

Labels: New Translator (Pull requests for new translators)

5 participants