Get the number of documents or features in an object.
    ndoc(x)

    nfeat(x)
| Argument | Description |
|---|---|
| x | a quanteda object: a corpus, dfm, or tokens object, or a readtext object from the readtext package. |
An integer giving the number of documents or features.
ndoc returns the number of documents in an object
whose texts are organized as "documents" (a corpus,
dfm, or tokens object, or a readtext object from the
readtext package).
nfeat returns the number of features from a dfm; it is an
alias for ntype when applied to dfm objects. This function is only
defined for dfm objects because only these have "features". (To count
tokens, see ntoken().)
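
Because a dfm stores documents as rows and features as columns, the two counts can be cross-checked against the matrix dimensions. The following is a minimal sketch under that assumption; the three-document corpus and the names `corp` and `dfmat` are made up for illustration and are not part of the package examples.

```r
library(quanteda)

# Toy corpus of three short documents (hypothetical, for illustration only)
corp <- corpus(c(d1 = "a b c", d2 = "a a b", d3 = "c d"))
dfmat <- dfm(tokens(corp))

# Documents are the rows of the dfm, features are its columns
ndoc(dfmat) == nrow(dfmat)
nfeat(dfmat) == ncol(dfmat)

# ndoc() also accepts the corpus and tokens objects the dfm was built from
ndoc(corp)
ndoc(tokens(corp))
```

Here the three calls to ndoc() should all agree, since building tokens and a dfm from a corpus does not add or drop documents.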
    # number of documents
    ndoc(data_corpus_inaugural)
    #> [1] 59
    ndoc(corpus_subset(data_corpus_inaugural, Year > 1980))
    #> [1] 11
    ndoc(tokens(data_corpus_inaugural))
    #> [1] 59
    ndoc(dfm(tokens(corpus_subset(data_corpus_inaugural, Year > 1980))))
    #> [1] 11

    # number of features
    toks1 <- tokens(corpus_subset(data_corpus_inaugural, Year > 1980), remove_punct = FALSE)
    toks2 <- tokens(corpus_subset(data_corpus_inaugural, Year > 1980), remove_punct = TRUE)
    nfeat(dfm(toks1))
    #> [1] 3426
    nfeat(dfm(toks2))
    #> [1] 3412