Get the number of documents or features in an object.
ndoc(x)
nfeat(x)
x: a quanteda object (a corpus, dfm, tokens, or tokens_xptr object), or a readtext object from the readtext package
ndoc() returns an integer count of the number of documents in an object whose texts are organized as "documents" (a corpus, dfm, or tokens/tokens_xptr object).
nfeat() returns an integer count of the number of features. It is an alias for ntype() for a dfm. This function is only defined for dfm objects because only these have "features".
# number of documents
ndoc(data_corpus_inaugural)
#> [1] 59
ndoc(corpus_subset(data_corpus_inaugural, Year > 1980))
#> [1] 11
ndoc(tokens(data_corpus_inaugural))
#> [1] 59
ndoc(dfm(tokens(corpus_subset(data_corpus_inaugural, Year > 1980))))
#> [1] 11
# number of features
toks1 <- tokens(corpus_subset(data_corpus_inaugural, Year > 1980), remove_punct = FALSE)
toks2 <- tokens(corpus_subset(data_corpus_inaugural, Year > 1980), remove_punct = TRUE)
nfeat(dfm(toks1))
#> [1] 3426
nfeat(dfm(toks2))
#> [1] 3410
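
Because a dfm is a document-by-feature matrix, the two counts should line up with its dimensions. The following is a minimal sketch of that relationship, assuming quanteda is installed; the toy texts and the names `corp` and `dfmat` are illustrative and not part of the original examples.

```r
library(quanteda)

# Hypothetical toy corpus: three short documents
corp <- corpus(c(d1 = "a b c", d2 = "a a d", d3 = "e"))
dfmat <- dfm(tokens(corp))

# Documents are the rows of a dfm, features are its columns
ndoc(dfmat) == nrow(dfmat)
nfeat(dfmat) == ncol(dfmat)

# The corpus and the dfm built from it hold the same number of documents
ndoc(corp) == ndoc(dfmat)
```

Each comparison is expected to evaluate to TRUE, which makes `ndoc()` and `nfeat()` convenient for sanity checks after subsetting or trimming a dfm.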