Drills with R on importing and plotting data, and finding the distribution measures

Provide, in plain text, the R commands that solve each of the following:

  1. The student directory for a large university has 400 pages with 130 names per page, a total of 52,000 names. Using software, show how to select a simple random sample of 10 names. 
  2. From the Murder data file, use the variable murder, which is the murder rate (per 100,000 population) for each state in the U.S. in 2017 according to the FBI Uniform Crime Reports. At first, do not use the observation for D.C. (DC). Using software:
    1. Find the mean and standard deviation and interpret their values.
    2. Find the five-number summary, and construct the corresponding boxplot.
    3. Now include the observation for D.C. Which is affected more by this outlier: the mean or the median?
  3. The Houses data file lists the selling price (thousands of dollars), size (square feet), tax bill (dollars), number of bathrooms, number of bedrooms, and whether the house is new (1 = yes, 0 = no) for 100 home sales in Gainesville, Florida. Let’s analyze the selling prices.
    1. Construct a frequency distribution and a histogram.
    2. Find the percentage of observations that fall within one standard deviation of the mean.
    3. Construct a boxplot. 
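
As a starting point for the first drill, one possible approach is to number the names from 1 to 52,000 and draw 10 of those numbers without replacement. The sketch below is only an illustration under that assumption. The seed, the variable names, and the way a number is mapped back to a page and a position on that page are choices made here, not part of the assignment.

```r
# Assumption: names are numbered 1..52000 in directory order,
# 130 names per page, 400 pages.
set.seed(1)  # arbitrary seed, used only so the draw can be repeated

# sample() draws without replacement by default, which gives a
# simple random sample of 10 distinct name numbers.
ids <- sample(1:52000, 10)

# Map each sampled number back to its page and its position on that page.
page     <- (ids - 1) %/% 130 + 1
position <- (ids - 1) %% 130 + 1

cbind(ids, page, position)
```

Drawing the number directly, rather than first a page and then a name, keeps every set of 10 names equally likely, which is what a simple random sample requires.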

The required datasets are listed at Index of Data Sets.

Useful functions in R to solve problems in this assignment: sample, read.table, mean, sd, summary, boxplot, hist, table, cbind, length, case, tapply
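
To show how several of these functions fit together, here is a minimal sketch on a small made-up data frame. The values are invented for illustration and are not taken from the Murder or Houses data files. The column name `state`, the file name in the commented `read.table` call, and the bin width are likewise assumptions. With the real files, the same calls would be applied to the imported data frame instead of `toy`.

```r
# With a real file, an import might look like this (hypothetical file name):
# toy <- read.table("Murder.dat", header = TRUE)

# Hypothetical toy data: values are made up, NOT from the Murder data file.
# The column name `state` is an assumption; `murder` is named in the task.
toy <- data.frame(
  state  = c("A", "B", "C", "D", "E", "DC"),
  murder = c(2, 3, 4, 5, 6, 25)
)

# Drop the D.C. row first, then compute the centre and spread.
no_dc <- toy[toy$state != "DC", ]
mean(no_dc$murder)
sd(no_dc$murder)

# summary() prints min, quartiles, median and max (plus the mean);
# boxplot() draws the corresponding plot.
summary(no_dc$murder)
boxplot(no_dc$murder, horizontal = TRUE, xlab = "murder")

# Compare mean and median without and with the outlying row.
cbind(
  mean   = c(without = mean(no_dc$murder),   with = mean(toy$murder)),
  median = c(without = median(no_dc$murder), with = median(toy$murder))
)

# Percentage of observations within one standard deviation of the mean.
x <- toy$murder
within <- abs(x - mean(x)) <= sd(x)
pct_within <- 100 * sum(within) / length(x)
pct_within

# Frequency distribution over equal-width bins, and the matching histogram.
breaks <- seq(0, 30, by = 10)  # assumed bin width of 10
freq <- table(cut(x, breaks = breaks))
freq
hist(x, breaks = breaks, xlab = "murder", main = "Toy data")
```

Passing the same `breaks` to both `cut()` and `hist()` keeps the printed frequency table and the histogram bars in agreement.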