% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/freq.R
\name{freq}
\alias{freq}
\alias{freq.default}
\alias{freq.factor}
\alias{freq.matrix}
\alias{freq.table}
\alias{freq.numeric}
\alias{freq.Date}
\alias{freq.hms}
\alias{is.freq}
\alias{top_freq}
\alias{header}
\alias{print.freq}
\title{Frequency table}
\usage{
freq(x, ...)

\method{freq}{default}(
  x,
  sort.count = TRUE,
  nmax = getOption("max.print.freq"),
  na.rm = TRUE,
  row.names = TRUE,
  markdown = !interactive(),
  digits = 2,
  quote = NULL,
  header = TRUE,
  title = NULL,
  na = "<NA>",
  sep = " ",
  decimal.mark = getOption("OutDec"),
  big.mark = "",
  ...
)

\method{freq}{factor}(x, ..., droplevels = FALSE)

\method{freq}{matrix}(x, ..., quote = FALSE)

\method{freq}{table}(x, ..., sep = " ")

\method{freq}{numeric}(x, ..., digits = 2)

\method{freq}{Date}(x, ..., format = "yyyy-mm-dd")

\method{freq}{hms}(x, ..., format = "HH:MM:SS")

is.freq(f)

top_freq(f, n)

header(f, property = NULL)

\method{print}{freq}(
  x,
  nmax = getOption("max.print.freq", default = 10),
  markdown = !interactive(),
  header = TRUE,
  decimal.mark = getOption("OutDec"),
  big.mark = ifelse(decimal.mark != ",", ",", "."),
  ...
)
}
\arguments{
\item{x}{vector of any class or a \code{\link{data.frame}} or \code{\link{table}}}

\item{...}{up to nine different columns of \code{x} when \code{x} is a \code{data.frame} or \code{tibble}, to calculate frequencies from; see Examples. Also supports quasiquotation.}

\item{sort.count}{sort on count, i.e. frequencies. This is \code{TRUE} by default, except when using grouping variables.}

\item{nmax}{number of rows to print. The default, \code{10}, uses \code{\link{getOption}("max.print.freq")}. Use \code{nmax = 0}, \code{nmax = Inf}, \code{nmax = NULL} or \code{nmax = NA} to print all rows.}

\item{na.rm}{a logical value indicating whether \code{NA} values should be removed from the frequency table. The header (if set) will always print the number of \code{NA}s.}

\item{row.names}{a logical value indicating whether row indices should be printed as \code{1:nrow(x)}}

\item{markdown}{a logical value indicating whether the frequency table should be printed in markdown format. This will print all rows (except when \code{nmax} is defined) and is the default behaviour in non-interactive R sessions (such as when knitting RMarkdown files).}

\item{digits}{how many significant digits are to be used for numeric values in the header (not for the items themselves; those depend on \code{\link{getOption}("digits")})}

\item{quote}{a logical value indicating whether or not strings should be printed with surrounding quotes. The default is to print quotes only around character values that are actually numeric values.}

\item{header}{a logical value indicating whether an informative header should be printed}

\item{title}{text to show above the frequency table. By default, it is derived from the variables passed to \code{x}.}

\item{na}{a character string that should be used to show empty (\code{NA}) values (only useful when \code{na.rm = FALSE})}

\item{sep}{a character string to separate the terms when selecting multiple columns}

\item{decimal.mark}{%
    used for prettying (longish) numerical and complex sequences.
    Passed to \code{\link[base]{prettyNum}}: that help page explains the details.}

\item{big.mark}{%
    used for prettying (longish) numerical and complex sequences.
    Passed to \code{\link[base]{prettyNum}}: that help page explains the details.}

\item{droplevels}{a logical value indicating whether in factors empty levels should be dropped}

\item{format}{a character string to define the printing format (it supports \code{\link{format_datetime}} to transform e.g. \code{"d mmmm yyyy"} to \code{"\%e \%B \%Y"})}

\item{f}{a frequency table}

\item{n}{number of top \emph{n} items to return; use \code{-n} for the bottom \emph{n} items. More than \code{n} rows will be returned if there are ties.}

\item{property}{name of a property in the header, to return its value directly}
}
\value{
A \code{data.frame} (with an additional class \code{"freq"}) with five columns: \code{item}, \code{count}, \code{percent}, \code{cum_count} and \code{cum_percent}.
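
As a rough illustration of how these five columns relate to each other, the base R sketch below builds them by hand. It is not the package's implementation: the object names \code{x}, \code{counts} and \code{out} are made up for this example, and storing \code{percent} as a fraction between 0 and 1 is an assumption.
\preformatted{
x <- c("a", "b", "a", "c", "a", "b")

# count each unique value and sort by count, highest first
counts <- sort(table(x), decreasing = TRUE)

out <- data.frame(item = names(counts),
                  count = as.integer(counts),
                  stringsAsFactors = FALSE)

# share of each item, then running totals of counts and shares
out$percent <- out$count / sum(out$count)
out$cum_count <- cumsum(out$count)
out$cum_percent <- out$cum_count / sum(out$count)
out
}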
}
\description{
Create a frequency table of a \code{vector} or a \code{data.frame}. It supports tidyverse's quasiquotation and RMarkdown for reports. The easiest way to use it is \code{data \%>\% freq(var)}, using the pipe from the \href{https://magrittr.tidyverse.org/#usage}{tidyverse}.

\code{top_freq} can be used to get the top/bottom \emph{n} items of a frequency table, with counts as names. It respects ties.
}
\details{
Frequency tables (or frequency distributions) are summaries of the distribution of values in a sample. With the \code{freq} function, you can create univariate frequency tables. Multiple variables will be pasted into one variable, which forces a univariate distribution.
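
The pasting of multiple variables can be pictured with the base R sketch below. The data frame \code{df} and its columns are hypothetical, and the code only mimics the idea using the default \code{sep = " "}; it is not taken from the package source.
\preformatted{
df <- data.frame(sex = c("F", "M", "F"),
                 ward = c("ICU", "ICU", "ICU"),
                 stringsAsFactors = FALSE)

# two variables become one, so the result is again univariate
combined <- paste(df$sex, df$ward, sep = " ")
table(combined)
}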

Input can be done in many different ways. Base R methods are:
\preformatted{
freq(df$variable)
freq(df[, "variable"])
}

Tidyverse methods are:
\preformatted{
df$variable \%>\% freq()
df[, "variable"] \%>\% freq()
df \%>\% freq("variable")
df \%>\% freq(variable)
}

For numeric values of any class, these additional values will all be calculated with \code{na.rm = TRUE} and shown in the header:
\itemize{
  \item{Mean, using \code{\link[base]{mean}}}
  \item{Standard Deviation, using \code{\link[stats]{sd}}}
  \item{Coefficient of Variation (CV), the standard deviation divided by the mean}
  \item{Median Absolute Deviation (MAD), using \code{\link[stats]{mad}}}
  \item{Tukey Five-Number Summaries (minimum, Q1, median, Q3, maximum), see \emph{NOTE} below}
  \item{Interquartile Range (IQR) calculated as \code{Q3 - Q1}, see \emph{NOTE} below}
  \item{Coefficient of Quartile Variation (CQV, sometimes called coefficient of dispersion) calculated as \code{(Q3 - Q1) / (Q3 + Q1)}, see \emph{NOTE} below}
  \item{Outliers (total count and percentage), using \code{\link[grDevices]{boxplot.stats}}}
}
\emph{NOTE}: These values are calculated using the same algorithm as used by Minitab and SPSS: \emph{p[k] = E[F(x[k])]}. See Type 6 on the \code{\link[stats]{quantile}} page.
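
To reproduce the quartile-based values outside of \code{freq()}, a sketch along these lines should be close. It only assumes the formulas listed above and \code{type = 6} of \code{\link[stats]{quantile}}; the example vector and the variable names are made up.
\preformatted{
x <- c(2, 4, 4, 5, 7, 9, 10, 12)

# quartiles following Type 6 (as used by Minitab and SPSS)
q <- stats::quantile(x, probs = c(0.25, 0.75), type = 6,
                     na.rm = TRUE, names = FALSE)
q1 <- q[1]
q3 <- q[2]

iqr <- q3 - q1                  # same as stats::IQR(x, type = 6)
cqv <- (q3 - q1) / (q3 + q1)    # coefficient of quartile variation
cv <- stats::sd(x) / mean(x)    # coefficient of variation
}
Note that the default of \code{quantile()} is \code{type = 7}, so results can differ slightly from a plain \code{quantile(x)} call.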

For dates and times of any class, these additional values will be calculated with \code{na.rm = TRUE} and shown in the header:
\itemize{
  \item{Oldest, using \code{\link{min}}}
  \item{Newest, using \code{\link{max}}, with difference between newest and oldest}
}
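
In base R terms, these date values correspond roughly to the following sketch (the example dates and variable names are made up):
\preformatted{
x <- as.Date(c("2020-01-15", "2019-12-31", NA, "2020-03-01"))

oldest <- min(x, na.rm = TRUE)
newest <- max(x, na.rm = TRUE)

# difference between newest and oldest, returned as a difftime
span <- newest - oldest
}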

In factors, levels that do not occur in the input data are kept by default (\code{droplevels = FALSE}); set \code{droplevels = TRUE} to drop them.

The function \code{top_freq} will include more than \code{n} rows if there are ties. Use a negative number for \emph{n} (like \code{n = -3}) to select the bottom \emph{n} values.
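
The effect of ties can be illustrated with the sketch below. The helper \code{top_with_ties} is hypothetical and not part of this package; it works on a plain named vector of counts and only shows why more than \code{n} values can come back.
\preformatted{
# hypothetical helper, for illustration only
top_with_ties <- function(counts, n) {
  counts <- sort(counts, decreasing = n > 0)
  # the count at position abs(n) is the cut-off; ties with it are kept
  cutoff <- counts[min(abs(n), length(counts))]
  if (n > 0) counts[counts >= cutoff] else counts[counts <= cutoff]
}

counts <- c(a = 10, b = 7, c = 7, d = 3, e = 1)
top_with_ties(counts, 2)    # a, b and c: b and c tie for second place
top_with_ties(counts, -1)   # e only
}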
}
\section{Extending the \code{freq()} function}{

Interested in extending the \code{freq()} function with your own class? Add a method like the one below to your package, and optionally define some header info by passing a \code{\link{list}} to the \code{.add_header} parameter, as in the example below for class \code{difftime}. This example assumes that you use the \code{roxygen2} package for package development.
\preformatted{
#' @exportMethod freq.difftime
#' @importFrom cleaner freq.default
#' @export
#' @noRd
freq.difftime <- function(x, ...) {
  freq.default(x = x, ...,
               .add_header = list(units = attributes(x)$units))
}
}
Be sure to call \code{freq.default} in your function and not just \code{freq}. Also, add \code{cleaner} to the \code{Imports:} field of your \code{DESCRIPTION} file, to make sure that it will be installed with your package, e.g.:
\preformatted{
Imports: cleaner
}
}

\examples{
freq(unclean$gender, markdown = FALSE)

freq(x = clean_factor(unclean$gender, 
                      levels = c("^m" = "Male", 
                                 "^f" = "Female")),
     markdown = TRUE,
     title = "Frequencies of a cleaned version for a markdown report!",
     header = FALSE,
     quote = TRUE)
}
\keyword{freq}
\keyword{frequency}
\keyword{summarise}
\keyword{summary}
