| Title: | 'Pubmed' Word Clouds | 
| Description: | Create a word cloud using the abstract of publications from 'Pubmed'. | 
| Version: | 0.3.6 | 
| Date: | 2019-02-28 | 
| Author: | Felix Yanhui Fan <nolanfyh@gmail.com> | 
| Imports: | XML, stringr, RCurl, wordcloud, tm, RColorBrewer | 
| Maintainer: | Felix Yanhui Fan <nolanfyh@gmail.com> | 
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] | 
| URL: | http://felixfan.github.io/PubMedWordcloud/ | 
| RoxygenNote: | 6.0.1 | 
| NeedsCompilation: | no | 
| Packaged: | 2019-03-01 02:04:15 UTC; alicefelix | 
| Repository: | CRAN | 
| Date/Publication: | 2019-03-01 05:30:07 UTC | 
clean data
Description
Remove punctuation and numbers, convert characters to lower or upper case, remove stopwords and user-specified words, and optionally stem words.
Usage
cleanAbstracts(abstracts, rmNum = TRUE, tolw = TRUE, toup = FALSE,
  rmWords = TRUE, yrWords = NULL, stemDoc = FALSE)
Arguments
| abstracts | output of getAbstracts, or just a paragraph of text | 
| rmNum | Whether to remove numbers from the text | 
| tolw | Translate characters in character vectors to lower case or not | 
| toup | Translate characters in character vectors to upper case or not | 
| rmWords | Remove a set of English stopwords (e.g., 'the') or not | 
| yrWords | A character vector listing the words to be removed. | 
| stemDoc | Whether to stem words in the text using Porter's stemming algorithm | 
Examples
# Abs=getAbstracts(c("22693232", "22564732"))
# cleanAbs=cleanAbstracts(Abs)
# text="Jobs received a number of honors and public recognition."
# cleanD=cleanAbstracts(text)
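The cleaning steps listed in the Description can be approximated in base R. This is a minimal sketch, not the package's implementation (the package imports tm, which the sketch does not use). The helper name clean_text, its arguments, the tiny stopword list, and the word/freq return shape are assumptions made for illustration; stemming is left out.

```r
# Base R approximation of the cleaning steps; not the package's own code.
# clean_text and its arguments are hypothetical names.
clean_text <- function(text, rm_num = TRUE, to_lower = TRUE,
                       stop_words = c("a", "of", "and", "the"),
                       extra_words = NULL) {
  x <- paste(text, collapse = " ")
  x <- gsub("[[:punct:]]+", " ", x)               # remove punctuation
  if (rm_num) x <- gsub("[[:digit:]]+", " ", x)   # remove numbers
  if (to_lower) x <- tolower(x)                   # convert to lower case
  words <- strsplit(x, "[[:space:]]+")[[1]]
  words <- words[nzchar(words)]
  # drop stopwords and user-specified words
  words <- words[!(tolower(words) %in% c(stop_words, extra_words))]
  freq <- sort(table(words), decreasing = TRUE)
  data.frame(word = names(freq), freq = as.integer(freq),
             stringsAsFactors = FALSE)
}

cleaned <- clean_text("Jobs received a number of honors and public recognition.")
print(cleaned)
```

The real stopword list used by cleanAbstracts is much longer than the four words assumed here, so the actual output for the same sentence may differ.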
plot colors
Description
plot colors.
Usage
colSets(type)
Arguments
| type | palette name, one of: Accent, Dark2, Pastel1, Pastel2, Paired, Set1, Set2, Set3. | 
Examples
# colors= colSets(type="Accent")
# colors= colSets(type="Paired")
# colors= colSets(type="Set3")
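The eight names accepted by type match the qualitative palettes of RColorBrewer, which this package imports. The sketch below looks up a full palette by name with brewer.pal; whether colSets does the same internally is an assumption, and get_palette is a hypothetical helper.

```r
# Maximum number of colors in each RColorBrewer qualitative palette.
max_colors <- c(Accent = 8, Dark2 = 8, Paired = 12, Pastel1 = 9,
                Pastel2 = 8, Set1 = 9, Set2 = 8, Set3 = 12)

# Hypothetical helper: return every color of the named palette.
get_palette <- function(type) {
  type <- match.arg(type, names(max_colors))
  RColorBrewer::brewer.pal(max_colors[[type]], type)
}

# Only call into RColorBrewer when it is installed.
if (requireNamespace("RColorBrewer", quietly = TRUE)) {
  print(get_palette("Accent"))
}
```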
edit PMIDs
Description
Combine two sets of PMIDs, or exclude one set of PMIDs from another.
Usage
editPMIDs(x, y, method = c("add", "exclude"))
Arguments
| x | output of getPMIDs, or a set of PMIDs | 
| y | output of getPMIDs, or a set of PMIDs | 
| method | either 'add' (default) or 'exclude'. See Details. | 
Details
When method is 'add', the PMIDs in 'x' and 'y' are combined. When method is 'exclude', the PMIDs in 'y' are removed from 'x'.
Examples
# pmid1=getPMIDs(author="Yan-Hui Fan",dFrom=2007,dTo=2013,n=10)
# rm1="22698742"
# pmids1=editPMIDs(x=pmid1,y=rm1,method="exclude")
# pmid2=getPMIDs(author="Yanhui Fan",dFrom=2007,dTo=2013,n=10)
# rm2="20576513"
# pmids2=editPMIDs(x=pmid2,y=rm2,method="exclude")
# pmids=editPMIDs(x=pmids1,y=pmids2,method="add")
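The two methods behave like set operations, which base R can illustrate directly. In this sketch, edit_ids is a hypothetical stand-in for editPMIDs, and dropping duplicates when adding is an assumption: the source only says the sets are "combined".

```r
# Hypothetical stand-in for editPMIDs built on base R set operations.
edit_ids <- function(x, y, method = c("add", "exclude")) {
  method <- match.arg(method)          # defaults to "add"
  x <- as.character(x)
  y <- as.character(y)
  if (method == "add") union(x, y) else setdiff(x, y)
}

a <- c("22693232", "22564732", "22698742")
b <- c("22564732", "20576513")

kept <- edit_ids(a, "22698742", method = "exclude")  # drops one PMID from a
merged <- edit_ids(a, b, method = "add")             # a and b without duplicates
print(kept)
print(merged)
```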
get Abstracts
Description
Retrieve the abstracts of the specified PMIDs from PubMed.
Usage
getAbstracts(pmid, https = TRUE, s = 100)
Arguments
| pmid | a set of PMIDs | 
| https | use https instead of http | 
| s | number of PMIDs to download per request | 
Examples
# pmids=c("22693232", "22564732", "22301463", "22015308", "21283797", "19412437")
# abstracts=getAbstracts(pmids)
# pmid="22693232"
# abstract=getAbstracts(pmid)
# pmids=getPMIDs(author="Yan-Hui Fan",dFrom=2007,dTo=2013,n=10)
# abstracts=getAbstracts(pmids)
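The s argument implies that PMIDs are fetched in batches. The sketch below shows one way to split a PMID vector into batches of size s and build one request URL per batch. It assumes the NCBI E-utilities efetch endpoint; the source does not show the URL the package actually builds, and batch_ids and efetch_url are hypothetical helpers. No request is sent.

```r
# Split PMIDs into batches of at most s (hypothetical helper).
batch_ids <- function(pmid, s = 100) {
  pmid <- as.character(pmid)
  split(pmid, ceiling(seq_along(pmid) / s))
}

# Build an E-utilities efetch URL for one batch (assumed endpoint).
efetch_url <- function(ids, https = TRUE) {
  scheme <- if (https) "https" else "http"
  paste0(scheme, "://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi",
         "?db=pubmed&retmode=xml&id=", paste(ids, collapse = ","))
}

pmids <- c("22693232", "22564732", "22301463",
           "22015308", "21283797", "19412437")
batches <- batch_ids(pmids, s = 4)                 # two batches: 4 + 2 PMIDs
urls <- vapply(batches, efetch_url, character(1))  # one URL per batch
print(unname(urls))
```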
get PMIDs using author names
Description
Retrieve PMIDs (each PMID is 8 digits long) from PubMed for the specified author and date range.
Usage
getPMIDs(author, dFrom, dTo, n = 500, https = TRUE)
Arguments
| author | author's name | 
| dFrom | start year | 
| dTo | end year | 
| n | max number of retrieved articles | 
| https | use https instead of http | 
Examples
# getPMIDs(author="Yan-Hui Fan",dFrom=2007,dTo=2013,n=10)
# getPMIDs(author="Yanhui Fan",dFrom=2007,dTo=2013,n=10)
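A PubMed search returns its matches as Id elements inside an XML document. The sketch below pulls the PMIDs out of such a response with a regular expression. The XML string is a hand-written sample in the E-utilities esearch layout, not real output, and the regex is a stand-in: the package imports XML and may parse the response differently. extract_ids is a hypothetical helper.

```r
# Hand-written sample in the esearch response layout (not real output).
xml <- paste0("<eSearchResult><Count>2</Count><IdList>",
              "<Id>22693232</Id><Id>22564732</Id>",
              "</IdList></eSearchResult>")

# Hypothetical helper: return the digits found between <Id> and </Id>.
extract_ids <- function(xml) {
  m <- gregexpr("(?<=<Id>)[0-9]+(?=</Id>)", xml, perl = TRUE)
  regmatches(xml, m)[[1]]
}

ids <- extract_ids(xml)
print(ids)
```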
get PMIDs using Journal names and Keywords
Description
Retrieve PMIDs (each PMID is 8 digits long) from PubMed for the specified journal, keywords, and date range.
Usage
getPMIDsByKeyWords(keys = NULL, journal = NULL, dFrom = NULL,
  dTo = NULL, n = 10000, https = TRUE)
Arguments
| keys | keywords | 
| journal | journal name | 
| dFrom | start year | 
| dTo | end year | 
| n | max number of retrieved articles | 
| https | use https instead of http | 
Examples
# getPMIDsByKeyWords(keys="breast cancer", journal="science",dTo=2013)
# getPMIDsByKeyWords(keys="breast cancer", journal="science")
# getPMIDsByKeyWords(keys="breast cancer",dFrom=2012,dTo=2013)
# getPMIDsByKeyWords(journal="science",dFrom=2012,dTo=2013)
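Because every search argument defaults to NULL, the query has to be assembled from whichever parts are supplied. The sketch below shows one way to do that with PubMed field tags ([ta] for journal, [dp] for publication date) and the E-utilities esearch endpoint. The helpers build_term and esearch_url, the placeholder bounds 1800 and 3000 for a missing year, and the exact query layout are assumptions; the source does not show how the package builds its query. No request is sent.

```r
# Combine the supplied search parts into one PubMed term (hypothetical helper).
build_term <- function(keys = NULL, journal = NULL, dFrom = NULL, dTo = NULL) {
  parts <- character(0)
  if (!is.null(keys)) parts <- c(parts, keys)
  if (!is.null(journal)) parts <- c(parts, paste0(journal, "[ta]"))
  if (!is.null(dFrom) || !is.null(dTo)) {
    # Assumed placeholders for an open-ended year range.
    from <- if (is.null(dFrom)) 1800 else dFrom
    to <- if (is.null(dTo)) 3000 else dTo
    parts <- c(parts, paste0(from, ":", to, "[dp]"))
  }
  if (length(parts) == 0) stop("supply at least one search argument")
  paste(parts, collapse = " AND ")
}

# Build an esearch URL for the term (assumed endpoint).
esearch_url <- function(term, n = 10000, https = TRUE) {
  scheme <- if (https) "https" else "http"
  paste0(scheme, "://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
         "?db=pubmed&retmax=", as.integer(n),
         "&term=", utils::URLencode(term, reserved = TRUE))
}

term <- build_term(keys = "breast cancer", journal = "science",
                   dFrom = 2012, dTo = 2013)
url <- esearch_url(term)
print(term)
print(url)
```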
PubMed wordcloud using function 'wordcloud' of package wordcloud
Description
PubMed wordcloud.
Usage
plotWordCloud(abs, scale = c(3, 0.3), min.freq = 1, max.words = 100,
  random.order = FALSE, rot.per = 0.35, use.r.layout = FALSE,
  colors = brewer.pal(8, "Dark2"))
Arguments
| abs | output of cleanAbstracts, or a data frame with one column named 'word' and one column named 'freq'. | 
| scale | A vector of length 2 indicating the range of the size of the words. | 
| min.freq | words with frequency below min.freq will not be plotted | 
| max.words | Maximum number of words to be plotted. The least frequent terms are dropped. | 
| random.order | plot words in random order. If false, they will be plotted in decreasing frequency | 
| rot.per | proportion of words with 90-degree rotation | 
| use.r.layout | if false, then C++ code is used for collision detection, otherwise R is used | 
| colors | color words from least to most frequent | 
Details
This function simply calls 'wordcloud' from package wordcloud. See package wordcloud for more details about the parameters.
Examples
# text="Jobs received a number of honors and public recognition." 
# cleanD=cleanAbstracts(text)
# plotWordCloud(cleanD,min.freq=1,scale=c(2,1))
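Since abs may also be a data frame with 'word' and 'freq' columns, word counts from any source can be plotted. The sketch below builds such a data frame in base R from a made-up word vector (the words are illustrative, not results) and passes it to plotWordCloud only when the package is installed.

```r
# Made-up words for illustration only.
words <- c("gene", "gene", "gene", "cancer", "cancer", "snp")

# Count each word and store the counts in the expected two-column layout.
counts <- table(words)
freq_df <- data.frame(word = names(counts), freq = as.integer(counts),
                      stringsAsFactors = FALSE)
freq_df <- freq_df[order(-freq_df$freq), ]   # most frequent first
print(freq_df)

# Plot only when PubMedWordcloud is available.
if (requireNamespace("PubMedWordcloud", quietly = TRUE)) {
  PubMedWordcloud::plotWordCloud(freq_df, min.freq = 1, scale = c(2, 1))
}
```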