PubMed XML Import Error

I have been receiving the following error message when trying to import an XML file of a PubMed Search (with 1784 references) into Zotero:

"an error occurred while trying to import the selected file. Please ensure that the file is valid and try again."

I have tried downloading the XML from three different browsers (Chrome, Safari, and Firefox), have updated and restarted Zotero, have cleared my browser cache, have ensured I have ample storage available in Zotero, and still receive the exact same message.

I am conducting a systematic review, so I am hesitant to pare down the number of papers until I can bring in the full searches from each of my main databases, remove duplicates, and then remove studies based on my inclusion/exclusion criteria. However, if there is no other workaround, I can try filtering my searches in PubMed by title to then import only the titled entries (and then review the others separately for inclusion outside of Zotero). Does this sound like my best path forward?

Right, of course you want the complete import -- I was suggesting you try to find out where the Zotero import fails. Just looking for titled entries and then handling non-titled ones separately seems like a good strategy. Most non-titled entries should have titles (in square brackets) in PubMed, so many of those will import into Zotero, too.

Sure, here is the search (which should also be sorted by Best Match when saving the XML):

Search (((((((Food preferences[mesh] OR health knowledge, attitudes, practice[MeSH Terms] OR knowledge[tiab] OR "well-being"[tiab] OR "well being"[tiab] OR (food and (habits[tiab] OR habit[tiab] OR choice[tiab] OR choices[tiab] OR behavior[tiab] OR behaviors[tiab] OR preferences[tiab] OR preference[tiab] OR attitudes[tiab] OR attitude[tiab])) OR outcome[tiab] OR outcomes[tiab] OR measure[tiab] OR measures[tiab] OR outcome assessment health care[MeSH Terms])) AND (cooking[MeSH Terms] OR culinary[Title/Abstract] OR cooking[Title/Abstract] OR cook[Title/Abstract] OR “food preparation”[Title/Abstract] OR “meal preparation”[tiab])))) AND (schools[Title/Abstract] OR school[Title/Abstract] OR schools[mesh] OR elementary[tiab] OR “high-school”[tiab] OR “high school”[tiab] OR youth[Title/Abstract] OR teenager[Title/Abstract] OR teen[Title/Abstract] OR teens[Title/Abstract] OR adolescent[Title/Abstract] OR adolescents[Title/Abstract] OR adolescence[tiab] OR pupil[Title/Abstract] OR pupils[Title/Abstract] OR student[Title/Abstract] OR students[Title/Abstract] OR child[Title/Abstract] OR children[Title/Abstract] OR youth[MeSH Terms] OR adolescent[MeSH Terms] OR child[MeSH Terms] OR minor[tiab] OR minors[tiab] OR minor[mesh])) AND ( "1980/01/01"[PDat] : "2017/11/31"[PDat] )) AND ( "1980/01/01"[PDat] : "2017/11/31"[PDat] )) Sort by: Title Filters: Publication date from 1980/01/01 to 2017/11/31

I'm trying to paste this (((((((Food preferences[mesh] OR health knowledge, attitudes, practice[MeSH Terms] OR knowledge[tiab] OR "well-being"[tiab] OR "well being"[tiab] OR (food and (habits[tiab] OR habit[tiab] OR choice[tiab] OR choices[tiab] OR behavior[tiab] OR behaviors[tiab] OR preferences[tiab] OR preference[tiab] OR attitudes[tiab] OR attitude[tiab])) OR outcome[tiab] OR outcomes[tiab] OR measure[tiab] OR measures[tiab] OR outcome assessment health care[MeSH Terms])) AND (cooking[MeSH Terms] OR culinary[Title/Abstract] OR cooking[Title/Abstract] OR cook[Title/Abstract] OR “food preparation”[Title/Abstract] OR “meal preparation”[tiab])))) AND (schools[Title/Abstract] OR school[Title/Abstract] OR schools[mesh] OR elementary[tiab] OR “high-school”[tiab] OR “high school”[tiab] OR youth[Title/Abstract] OR teenager[Title/Abstract] OR teen[Title/Abstract] OR teens[Title/Abstract] OR adolescent[Title/Abstract] OR adolescents[Title/Abstract] OR adolescence[tiab] OR pupil[Title/Abstract] OR pupils[Title/Abstract] OR student[Title/Abstract] OR students[Title/Abstract] OR child[Title/Abstract] OR children[Title/Abstract] OR youth[MeSH Terms] OR adolescent[MeSH Terms] OR child[MeSH Terms] OR minor[tiab] OR minors[tiab] OR minor[mesh])) AND ( "1980/01/01"[PDat] : "2017/11/31"[PDat] )) AND ( "1980/01/01"[PDat] : "2017/11/31"[PDat] ))

I have been exporting them as an XML file, and then selecting import through the Zotero drop-down menu and selecting that XML file. I am now exporting in batches of 200 through the "export to citation manager" feature in PubMed, which is tedious but seems to be working okay.

Still no idea why the XML route is not working for me -- perhaps a flash memory issue with my laptop?

I select the "send to" drop down on the PubMed search results page, then "file" as the destination, then XML as the format and Best Match as the sort style. I have not seen any options for different XML formats - is there another way to export the search results?

I get the same error for a completely different search. I tried XML (& txt & CSV that I re-saved to xml and various other tinkering) versions. All failed to import. Zotero created a folder/collection and seemed to think about it for a period, showed a red importing box, but in the end failed every time. I re-ordered the items by title and examined the first and last pages - no missing titles.

I'll take another look. One thing to try: open the .xml file in a text editor, select all, copy and then use "import from clipboard" in Zotero. Note that this opens into your currently selected collection.

FWIW In my experience this occurs with the PubMed XML of journals that publish articles in more than one language. English language articles sometimes have an empty title field but have the English title instead within the vernacular title tags. I'll provide an example of the flawed XML the next time I encounter this. I apologize for the confused wording above. Things will become quite clear when I can provide an example.
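
In the meantime, a rough way to check an export for this pattern is to scan it for records whose title field is empty. The following is only a sketch: it assumes the export is a PubMed `PubmedArticleSet` file with `ArticleTitle` and `VernacularTitle` elements, and the `find_untitled` helper and the two sample records are made up for illustration. It does not show what Zotero itself does during import.

```python
import xml.etree.ElementTree as ET

def find_untitled(xml_text):
    """Return (PMID, vernacular title) pairs for records whose ArticleTitle is empty."""
    root = ET.fromstring(xml_text)
    hits = []
    for article in root.iter("PubmedArticle"):
        title = article.find(".//ArticleTitle")
        # itertext() also collects text nested in inline markup such as <i> or <sub>
        text = "".join(title.itertext()).strip() if title is not None else ""
        if not text:
            pmid = article.findtext(".//PMID", default="?")
            vernacular = article.findtext(".//VernacularTitle", default="")
            hits.append((pmid, vernacular.strip()))
    return hits

# Invented two-record sample; real exports carry many more fields per record.
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle><MedlineCitation><PMID>1</PMID>
    <Article><ArticleTitle>Cooking classes in schools.</ArticleTitle></Article>
  </MedlineCitation></PubmedArticle>
  <PubmedArticle><MedlineCitation><PMID>2</PMID>
    <Article><ArticleTitle/><VernacularTitle>Titre d'exemple.</VernacularTitle></Article>
  </MedlineCitation></PubmedArticle>
</PubmedArticleSet>"""

print(find_untitled(SAMPLE))
```

For a saved export, reading the file contents and passing them to `find_untitled` should list the PMIDs worth looking at by hand. An empty result would suggest the empty-title pattern is not what is breaking that particular file.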

I tried "import from clipboard" - no good (I copied from the XML file, a plain text file, & tab-delimited.) I set "English only" as a PubMed filter before I exported the articles. (And I'm most of the way through screening them and haven't yet come across one where the title is in another language or screwed up in any other way.)

I tried displaying 200 citations in PubMed on a webpage and tried the browser capture function. I select and the little red box appears and the list appears in it as one would expect. Then at the bottom of the list is the same error message and it doesn't import. That happens regardless of whether it's 1 item or 200 items selected. And there's nothing wrong with the titles, and all are in English. (Trying the same thing with Google Scholar DOES work). If I have a complete citation on the page rather than the list, Zotero DOES capture it, though not entirely faithfully.

I'm using Zotero on a Mac, but that shouldn't matter. It updated within the last 24 hours or so. And I've restarted both the machine & Zotero.

Does this happen regardless of search terms? If so, could we get debug output from Zotero for a failed attempt to import an XML file that is as small as possible?

For example:
1. Search for test
2. Choose to only show five search results
3. Select Format --> XML
4. Select all + copy
5. Turn on debug output in Zotero (note we want this from Zotero, not the connector)
6. Import from clipboard --> assuming this fails
7. Submit Debug
8. Post debug ID here

I skim-read the debug output and saw the term "foreign". So I skimmed the xml file. None of the papers are non-English, but one of the Journals is called "Le journal canadien des sciences neurologiques". It's not terribly uncommon for Canadian Journals to have French names and publish both English and French papers. Might that be the problem? (But plenty of Journals have names that aren't proper English words, especially if you include the names listed in abbreviated form.) The debug code is D1197405201.

Interestingly, there was a Zotero security update yesterday. After installing and re-starting Zotero it DID import an xml file with 10 references from PubMed, but still WON'T import one with 100 citations.
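
Since a 10-reference file imports and a 100-reference file does not, one way to narrow down which record trips the import is to split the export into smaller pieces and try each piece separately. The sketch below assumes the export is a PubMed `PubmedArticleSet` file with one child element per record. The `split_records` helper and the sample data are hypothetical, and the output drops the original XML declaration and DOCTYPE line, which may or may not matter to Zotero's importer.

```python
import xml.etree.ElementTree as ET

def split_records(xml_text, size):
    """Split a PubmedArticleSet into smaller sets of at most `size` records each."""
    root = ET.fromstring(xml_text)
    records = list(root)  # direct children, one per exported record
    chunks = []
    for start in range(0, len(records), size):
        part = ET.Element(root.tag, root.attrib)  # fresh root with the same tag
        part.extend(records[start:start + size])
        chunks.append(ET.tostring(part, encoding="unicode"))
    return chunks

# Invented five-record sample containing only PMIDs.
SAMPLE = "<PubmedArticleSet>" + "".join(
    f"<PubmedArticle><MedlineCitation><PMID>{i}</PMID></MedlineCitation></PubmedArticle>"
    for i in range(1, 6)
) + "</PubmedArticleSet>"

chunks = split_records(SAMPLE, 2)
print(len(chunks))
```

Each string in `chunks` could be saved to its own .xml file, or pasted via "import from clipboard". Whichever chunk fails can be split again until a single record is left.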

(There wasn't actually a security update, just a normal update. If the updater is saying that, that's a mistake. The updater we were using used to call them that, but I thought we fixed that… The update is unlikely to have had any effect on this, in any case.)