Plotting in R: Binomial and Negative Binomial Distributions

The binomial and negative binomial distributions, plotted as follows:

(1) Binomial distribution R code

> N <- 100000
> n <- 100
> p <- 0.9
> x <- rbinom(N, n, p)
> hist(x,
+      xlim = c(min(x), max(x)),
+      probability = TRUE, # plot probability densities; the default plots frequencies (counts)
+      nclass = max(x) - min(x) +1,
+      col = 'lightblue',
+      main = "Binomial Distribution, n=100, p=.9" )
> lines(density(x,
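+               bw=1, # smoothing bandwidth (standard deviation of the kernel), not the kurtosis; larger values give a smoother curve
+               kernel = 'gaussian'), # the Gaussian kernel is the default
+       col = 'blue',
+       lwd = 3) # line width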
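
As a rough check on the plot above, the simulated histogram can be compared with the exact probabilities from `dbinom`. This is only a sketch: the seed, the half-integer `brk` vector, and the name `max_gap` are my own additions, not part of the original code. Explicit breaks are used because `nclass` is only a suggestion to `hist`.

```r
set.seed(1)                      # fixed seed so the sketch is reproducible (assumption)
N <- 100000; n <- 100; p <- 0.9
x <- rbinom(N, n, p)

# One bin per integer value: breaks sit halfway between integers
brk <- seq(min(x) - 0.5, max(x) + 0.5, by = 1)
h <- hist(x, breaks = brk, probability = TRUE, col = "lightblue",
          main = "Binomial: simulated vs. exact")

# Exact probabilities P(X = k) from dbinom, drawn as points
k <- min(x):max(x)
points(k, dbinom(k, n, p), pch = 19, col = "red")

# Largest gap between the empirical and exact probabilities
max_gap <- max(abs(h$density - dbinom(k, n, p)))
```

With bins of width 1, the bar heights are the observed proportions, so `max_gap` should be small for a sample this large.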

(2) Negative binomial distribution R code

> N <- 100000
> n <- 100
> p <- 0.9
> x <- rnbinom(N, n, p)
> hist(x,
+      xlim = c(min(x), max(x)),
+      probability = TRUE,
+      nclass = max(x) - min(x) +1,
+      col = 'lightblue',
+      main = "Negative Binomial Distribution, n=100, p=.9" )
> lines(density(x,
+               bw=1,
+               kernel = 'gaussian'),
+       col = 'blue',
+       lwd = 3)
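
One thing worth keeping in mind for the code above: in `rnbinom(N, size, prob)` the second argument is the target number of successes, and the values returned are counts of failures before that many successes. A small sketch (the seed and the names `theo_mean` and `theo_var` are mine) compares the sample moments with the textbook formulas for this parameterization:

```r
set.seed(1)                          # fixed seed for reproducibility (assumption)
size <- 100; prob <- 0.9
x <- rnbinom(100000, size, prob)     # failures before the 100th success

# Theoretical moments for the size/prob parameterization
theo_mean <- size * (1 - prob) / prob
theo_var  <- size * (1 - prob) / prob^2

c(sample_mean = mean(x), theoretical_mean = theo_mean)
c(sample_var  = var(x),  theoretical_var  = theo_var)
```

This is why the negative binomial histogram sits near 11 rather than near 90 as in the binomial case.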

----------------------------------------------------------------------------------------------------------

Brief explanation of the functions:

(1) Gaussian function

In mathematics, a Gaussian function is a function of the form:

$f(x) = a e^{- { \frac{(x-b)^2 }{ 2 c^2} } }$

for some real constants a, b, and nonzero c, where e ≈ 2.718281828 (Euler's number).

(http://en.wikipedia.org/wiki/Gaussian_function)
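
To connect the formula to R: the normal density `dnorm` is a Gaussian function with b equal to the mean, c equal to the standard deviation, and a = 1/(c·sqrt(2π)). A minimal sketch, where the helper `gauss` and the values of `mu` and `sigma` are hypothetical choices of mine:

```r
# Gaussian function f(x) = a * exp(-(x - b)^2 / (2 * c^2))
gauss <- function(x, a, b, c) a * exp(-(x - b)^2 / (2 * c^2))

mu <- 90; sigma <- 3                 # illustrative values (assumption)
a  <- 1 / (sigma * sqrt(2 * pi))     # normalizing constant so the area is 1

x <- seq(80, 100, by = 0.5)
same <- all.equal(gauss(x, a, mu, sigma), dnorm(x, mean = mu, sd = sigma))
same
```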

(2) Histograms

Description

The generic function hist computes a histogram of the given data values. If plot=TRUE, the resulting object of class "histogram" is plotted by plot.histogram, before it is returned.

(R help page `?hist`; local help-server link: http://127.0.0.1:12956/library/graphics/html/hist.html)

Usage

hist(x, ...)

## Default S3 method:
hist(x, breaks = "Sturges",
     freq = NULL, probability = !freq,
     include.lowest = TRUE, right = TRUE,
     density = NULL, angle = 45, col = NULL, border = NULL,
     main = paste("Histogram of" , xname),
     xlim = range(breaks), ylim = NULL,
     xlab = xname, ylab,
     axes = TRUE, plot = TRUE, labels = FALSE,
     nclass = NULL, warn.unused = TRUE, ...)

Arguments

`x`	a vector of values for which the histogram is desired.
`breaks`	one of: a vector giving the breakpoints between histogram cells, a single number giving the number of cells for the histogram, a character string naming an algorithm to compute the number of cells (see ‘Details’), a function to compute the number of cells. In the last three cases the number is a suggestion only.
`freq`	logical; if `TRUE`, the histogram graphic is a representation of frequencies, the `counts` component of the result; if `FALSE`, probability densities, component `density`, are plotted (so that the histogram has a total area of one). Defaults to `TRUE` if and only if `breaks` are equidistant (and `probability` is not specified).
`probability`	an alias for `!freq`, for S compatibility.
`include.lowest`	logical; if `TRUE`, an `x[i]` equal to the `breaks` value will be included in the first (or last, for `right = FALSE`) bar. This will be ignored (with a warning) unless `breaks` is a vector.
`right`	logical; if `TRUE`, the histogram cells are right-closed (left open) intervals.
`density`	the density of shading lines, in lines per inch. The default value of `NULL` means that no shading lines are drawn. Non-positive values of `density` also inhibit the drawing of shading lines.
`angle`	the slope of shading lines, given as an angle in degrees (counter-clockwise).
`col`	a colour to be used to fill the bars. The default of `NULL` yields unfilled bars.
`border`	the color of the border around the bars. The default is to use the standard foreground color.
`main, xlab, ylab`	these arguments to `title` have useful defaults here.
`xlim, ylim`	the range of x and y values with sensible defaults. Note that `xlim` is not used to define the histogram (breaks), but only for plotting (when `plot = TRUE`).
`axes`	logical. If `TRUE` (default), axes are drawn if the plot is drawn.
`plot`	logical. If `TRUE` (default), a histogram is plotted, otherwise a list of breaks and counts is returned. In the latter case, a warning is used if (typically graphical) arguments are specified that only apply to the `plot = TRUE` case.
`labels`	logical or character. Additionally draw labels on top of bars, if not `FALSE`; see `plot.histogram`.
`nclass`	numeric (integer). For S(-PLUS) compatibility only, `nclass` is equivalent to `breaks` for a scalar or character argument.
`warn.unused`	logical. If `plot=FALSE` and `warn.unused=TRUE`, a warning will be issued when graphical parameters are passed to `hist.default()`.
`...`	further arguments and graphical parameters passed to `plot.histogram` and thence to `title` and `axis` (if `plot=TRUE`).
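
A short sketch of a few of these arguments in use, assuming integer-valued data like the simulations above. With `plot = FALSE` nothing is drawn and the "histogram" object is returned for inspection; the `breaks` vector and the name `total_area` are my own choices:

```r
set.seed(1)                              # fixed seed for reproducibility (assumption)
x <- rbinom(1000, 100, 0.9)

# Explicit break points, one cell per integer; no plot is drawn
h <- hist(x, breaks = seq(min(x) - 0.5, max(x) + 0.5, by = 1), plot = FALSE)

class(h)                                 # "histogram"
names(h)                                 # components such as breaks, counts, density, mids

# The density component integrates to one over the cells
total_area <- sum(h$density * diff(h$breaks))
total_area
```

Passing a vector to `breaks` fixes the cells exactly, whereas a single number (or `nclass`) is treated as a suggestion only.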

(3) Kernel Density Estimation

Description

The (S3) generic function density computes kernel density estimates. Its default method does so with the given kernel and bandwidth for univariate observations.

(R help page `?density`; local help-server link: http://127.0.0.1:12956/library/stats/html/density.html)

Usage

density(x, ...)
## Default S3 method:
density(x, bw = "nrd0", adjust = 1,
        kernel = c("gaussian", "epanechnikov", "rectangular",
                   "triangular", "biweight",
                   "cosine", "optcosine"),
        weights = NULL, window = kernel, width,
        give.Rkern = FALSE,
        n = 512, from, to, cut = 3, na.rm = FALSE, ...)

Arguments

`x`	the data from which the estimate is to be computed.
`bw`	the smoothing bandwidth to be used. The kernels are scaled such that this is the standard deviation of the smoothing kernel. (Note this differs from the reference books cited below, and from S-PLUS.) `bw` can also be a character string giving a rule to choose the bandwidth. See `bw.nrd`. The default, `"nrd0"`, has remained the default for historical and compatibility reasons, rather than as a general recommendation, where e.g., `"SJ"` would rather fit, see also V&R (2002). The specified (or computed) value of `bw` is multiplied by `adjust`.
`adjust`	the bandwidth used is actually `adjust*bw`. This makes it easy to specify values like ‘half the default’ bandwidth.
`kernel, window`	a character string giving the smoothing kernel to be used. This must be one of `"gaussian"`, `"rectangular"`, `"triangular"`, `"epanechnikov"`, `"biweight"`, `"cosine"` or `"optcosine"`, with default `"gaussian"`, and may be abbreviated to a unique prefix (single letter). `"cosine"` is smoother than `"optcosine"`, which is the usual ‘cosine’ kernel in the literature and almost MSE-efficient. However, `"cosine"` is the version used by S.
`weights`	numeric vector of non-negative observation weights, hence of same length as `x`. The default `NULL` is equivalent to `weights = rep(1/nx, nx)` where `nx` is the length of (the finite entries of) `x[]`.
`width`	this exists for compatibility with S; if given, and `bw` is not, will set `bw` to `width` if this is a character string, or to a kernel-dependent multiple of `width` if this is numeric.
`give.Rkern`	logical; if true, no density is estimated, and the ‘canonical bandwidth’ of the chosen `kernel` is returned instead.
`n`	the number of equally spaced points at which the density is to be estimated. When `n > 512`, it is rounded up to a power of 2 during the calculations (as `fft` is used) and the final result is interpolated by `approx`. So it almost always makes sense to specify `n` as a power of two.
`from,to`	the left and right-most points of the grid at which the density is to be estimated; the defaults are `cut * bw` outside of `range(x)`.
`cut`	by default, the values of `from` and `to` are `cut` bandwidths beyond the extremes of the data. This allows the estimated density to drop to approximately zero at the extremes.
`na.rm`	logical; if `TRUE`, missing values are removed from `x`. If `FALSE` any missing values cause an error.
`...`	further arguments for (non-default) methods.
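
The effect of `bw`, `adjust`, `n`, and `cut` can be seen in a small sketch. The data and the names `d1` and `d2` are illustrative assumptions, not from the original post:

```r
set.seed(1)                               # fixed seed for reproducibility (assumption)
x <- rbinom(10000, 100, 0.9)

d1 <- density(x, bw = 1)                  # bandwidth 1, as in the plots above
d2 <- density(x, bw = 1, adjust = 2)      # effective bandwidth is adjust * bw

c(d1$bw, d2$bw)                           # bandwidths actually used
length(d1$x)                              # number of grid points (default n = 512)
range(d1$x) - range(x)                    # grid extends cut * bw beyond the data

# A wider bandwidth smooths more, which lowers the peak of the estimate
c(peak_bw1 = max(d1$y), peak_bw2 = max(d2$y))
```

So `bw` controls how smooth the fitted curve is: small values follow the individual integer spikes, larger values flatten them out.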

posted on 2012-12-27 09:23 半个馒头