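Requirement: When writing a paper in LaTeX, you often need to convert a list of references into BibTeX format. With a large bibliography, doing this by hand is repetitive work that is not worth the time. If you already use a reference manager such as EndNote or Zotero, you can export everything with one click; if not, this post offers a solution.
Solution: Crossref API + Google Scholar API
Crossref is the largest DOI registration platform for foreign-language literature and holds metadata for nearly all of it. Some items, however, including but not limited to arXiv preprints, cannot be found there, and that is where Google Scholar helps.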
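To make the Crossref step concrete, here is a minimal sketch of the kind of request involved. It is not taken from the get_bibtex package: the helper names `build_crossref_query` and `bibtex_request_for_doi` are hypothetical. It assumes the public Crossref REST endpoint `https://api.crossref.org/works` with its `query.bibliographic`, `rows` and `mailto` parameters, and DOI content negotiation with the `application/x-bibtex` media type. The sketch only builds the request and sends nothing.

```python
from urllib.parse import urlencode

CROSSREF_WORKS_URL = "https://api.crossref.org/works"


def build_crossref_query(reference: str, email: str, rows: int = 1) -> str:
    """Build a Crossref search URL for one free-text reference (hypothetical helper)."""
    params = {
        "query.bibliographic": reference.strip(),  # the raw citation text
        "rows": rows,                              # how many candidates to return
        "mailto": email,                           # identifies the caller to Crossref
    }
    return f"{CROSSREF_WORKS_URL}?{urlencode(params)}"


def bibtex_request_for_doi(doi: str) -> tuple[str, dict[str, str]]:
    """Return the URL and headers that ask doi.org for a BibTeX rendering of a DOI."""
    return f"https://doi.org/{doi}", {"Accept": "application/x-bibtex"}
```

An HTTP client would send the first URL, read the DOI of the best match from the JSON response, and then issue the second request to fetch the BibTeX text.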
To save everyone time, I have already wrapped these two APIs in a package; you only need to install it with pip:
pip install get_bibtex
After that, use it as follows:
from apiModels.get_bibtex_from_crossref import GetBibTex
from apiModels.get_bibtex_from_google_scholar import GetBibTexFromGoogleScholar

if __name__ == '__main__':
    google_scholar_api_key = "your_google_scholar_api_key"
    get_bibtex_from_crossref = GetBibTex("your email")
    get_bibtex_from_google_scholar = GetBibTexFromGoogleScholar(google_scholar_api_key, GetBibTexFromGoogleScholar.APA)

    # read the raw references, one per line
    with open("inputfile/Bibliographyraw.txt", "r", encoding='utf-8') as f:
        raws = f.readlines()

    # get bibtex from CrossRef and failed search results
    success_bibtexs_crossref, failed_results = get_bibtex_from_crossref.get_bibtexs(raws)
    # for each failed search result, get bibtex from Google Scholar
    success_bibtexs_google, failed_results = get_bibtex_from_google_scholar.get_bibtexs(failed_results)

    with open("outputfile/BibliographyCrossRef.txt", "w", encoding='utf-8') as f:
        for bibtex in success_bibtexs_crossref:
            f.write(bibtex)
    with open("outputfile/BibliographyGoogleScholar.txt", "w", encoding='utf-8') as f:
        for index, bibtex in enumerate(success_bibtexs_google):
            # prefix each entry with its index, e.g. "[0] ..."
            f.write("[{}]".format(index) + " " + bibtex + "\n")
    # references that neither source could find
    with open("outputfile/not_find.txt", "w", encoding='utf-8') as f:
        for result in failed_results:
            f.write(result + "\n")

    print("find bibtex from CrossRef: ", len(success_bibtexs_crossref))
    print("find bibtex from Google Scholar: ", len(success_bibtexs_google))
    print("not find: ", len(failed_results))
Key code explained
Bibliographyraw.txt contains the references to look up, one per line.
For example:
J. Hu, L. Shen, S. Albanie, G. Sun, and A. Vedaldi, “Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks.” arXiv, Jan. 12, 2019. doi: 10.48550/arXiv.1810.12348.
X. Wang, R. Girshick, A. Gupta, and K. He, “Non-local Neural Networks.” arXiv, Apr. 13, 2018. doi: 10.48550/arXiv.1711.07971.
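Both sample lines already carry a DOI, so one optional preprocessing step is to pull it out before querying anything. The sketch below is an illustration only, not part of get_bibtex: `extract_doi` and `DOI_PATTERN` are hypothetical names, and the regular expression is a common approximation of the DOI syntax rather than a full validator.

```python
import re
from typing import Optional

# "10.", a 4-9 digit registrant code, "/", then any run of non-space characters.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/\S+")


def extract_doi(reference: str) -> Optional[str]:
    """Return the first DOI-like token in a raw reference line, or None."""
    match = DOI_PATTERN.search(reference)
    if match is None:
        return None
    # Citations usually end with punctuation that is not part of the DOI.
    return match.group(0).rstrip(".,;")
```

The `10.48550` prefix in these examples belongs to arXiv, whose DOIs are registered with DataCite rather than Crossref, which fits the observation that Crossref does not find such entries.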
------------------
success_bibtexs_crossref, failed_results = get_bibtex_from_crossref.get_bibtexs(raws)
The first return value is the list of BibTeX entries; the second is the list of original references that were not found.
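The found/failed pair can be pictured with a small stand-in. This is a sketch of the return contract only, under the assumption that a lookup yields a BibTeX string or nothing; `partition_lookups` and its `lookup` argument are hypothetical and do not reflect how get_bibtexs is implemented internally.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def partition_lookups(
    raws: Iterable[str],
    lookup: Callable[[str], Optional[str]],
) -> Tuple[List[str], List[str]]:
    """Split references into (BibTeX entries found, references not found)."""
    found: List[str] = []
    failed: List[str] = []
    for raw in raws:
        reference = raw.strip()      # readlines() keeps the trailing newline
        if not reference:
            continue                 # skip blank lines in the input file
        bibtex = lookup(reference)
        if bibtex:
            found.append(bibtex)
        else:
            failed.append(reference) # keep the original text for the next source
    return found, failed
```

Because the second list holds the untouched reference text, it can be passed straight to another source.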
success_bibtexs_google, failed_results = get_bibtex_from_google_scholar.get_bibtexs(failed_results)
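References that Crossref could not find are then queried through the Google Scholar API. In most cases they are found there; very few references are missing from Google Scholar.
Note: the result here is actually in APA format, as specified at initialization above. There is also a parameter that makes it return BibTeX format instead, for example: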
get_bibtex_from_google_scholar = GetBibTexFromGoogleScholar(google_scholar_api_key, GetBibTexFromGoogleScholar.APA, flag = True)
However, this requires setting a proxy server, for example:
import os
import re
import requests
os.environ["http_proxy"]="127.0.0.1:7890"
os.environ["https_proxy"]="127.0.0.1:7890"
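As a variation on the lines above, the proxy address can be written with an explicit scheme and kept in one helper. This is a sketch under assumptions: `set_proxy` is a hypothetical name, 127.0.0.1:7890 is simply the address used above, and the returned dictionary follows the `{"http": ..., "https": ...}` shape that the requests library accepts for its `proxies` argument. Whether get_bibtex reads the environment variables or needs the dictionary is not stated in this post.

```python
import os
from typing import Dict


def set_proxy(host: str = "127.0.0.1", port: int = 7890) -> Dict[str, str]:
    """Export proxy environment variables and return a requests-style proxies dict."""
    proxy_url = f"http://{host}:{port}"   # explicit scheme avoids ambiguity
    for name in ("http_proxy", "https_proxy"):
        os.environ[name] = proxy_url
    return {"http": proxy_url, "https": proxy_url}
```

Call it once before creating the Google Scholar client so that later HTTP requests see the settings.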
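Note: you need to apply for an API key first at serpapi.com. The free quota is 100 searches per month, which is usually enough.
The rest of the code is self-explanatory.
There is also a method for a single query:
get_bibtex() (just drop the s)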
Update, 2024.4.16
1. Added a DBLP interface
from apiModels.get_bibtex_from_dblp import GetBibTexFromDBLP
2. Improved ease of use
A ready-to-use class is now provided; it already wraps the Crossref and DBLP APIs.
from apiModels.workflow.crossref2dblp import Crossref2Dblp
Usage (without the Google Scholar API)
crossref2dblp = Crossref2Dblp("your email", "inputfile/Bibliographyraw.txt", "outputfile/Bibliography.txt")
crossref2dblp.running()
Then wait for it to finish.
(With the Google Scholar API)
from apiModels.workflow.crossref2dblp import Crossref2Dblp
from apiModels.get_bibtex_from_google_scholar import GetBibTexFromGoogleScholar
get_bibtex_from_google_scholar = GetBibTexFromGoogleScholar(api_key="your api key")
Pass the Google Scholar object you created as the last argument:
crossref2dblp = Crossref2Dblp("your email", "inputfile/Bibliographyraw.txt", "outputfile/Bibliography.txt", get_bibtex_from_google_scholar)
crossref2dblp.running()
Then wait for it to finish.
Or, if you want to define the order in which the APIs are called yourself:
from apiModels.workflow.make_workflow import MakeWorkflow
from apiModels.get_bibtex_from_google_scholar import GetBibTexFromGoogleScholar
from apiModels.get_bibtex_from_crossref import GetBibTex
get_bibtex_from_google_scholar = GetBibTexFromGoogleScholar(api_key="your api key")
get_bibtex_from_crossref = GetBibTex("your email")
make_workflow = MakeWorkflow("inputfile/Bibliographyraw.txt", "outputfile/Bibliography.txt", get_bibtex_from_google_scholar, get_bibtex_from_crossref)
make_workflow.running()
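The idea of a user-defined call order can be sketched as a simple fallback chain. This is not MakeWorkflow's actual implementation: `BibTexSource` and `run_in_order` are hypothetical names, and the only assumption is that each source exposes `get_bibtexs(raws)` returning a (found, failed) pair as described earlier.

```python
from typing import List, Protocol, Sequence, Tuple


class BibTexSource(Protocol):
    """Anything with a get_bibtexs method that returns (found, failed)."""

    def get_bibtexs(self, raws: Sequence[str]) -> Tuple[List[str], List[str]]: ...


def run_in_order(
    raws: Sequence[str], sources: Sequence[BibTexSource]
) -> Tuple[List[str], List[str]]:
    """Try each source in turn, passing only the still-unresolved references on."""
    found: List[str] = []
    remaining = list(raws)
    for source in sources:
        if not remaining:
            break                      # nothing left to look up
        hits, remaining = source.get_bibtexs(remaining)
        found.extend(hits)
    return found, remaining
```

Putting the source with no query quota first keeps calls to a rate-limited API, such as the 100 free searches per month mentioned above, to a minimum.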
Before use:
pip install get_bibtex==1.1.0
Improvements are welcome.
This post is synced to xLog by Mix Space.
The original link is https://me.liuyaowen.club/posts/default/20240816and2