Study Notes on Statistical Learning and Data Visualization in R
Welcome
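Welcome to these study notes on statistical learning and data visualization in R. The code in this project comes mainly from three sources.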
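The notes fall into two parts. The first part contains source code for the end-of-chapter exercises in An Introduction to Statistical Learning with Applications in R, along with some brief commentary of my own. The second part is a set of ggplot2 visualization notes covering advanced usage: how ggplot2 works internally, the principles behind writing ggplot2 extensions, and advanced plotting with ggplot2. The ggplot2 part may not suit beginners; reading ggplot2: Elegant Graphics for Data Analysis (3e) first is recommended.
This project records my own process of learning statistical learning and data visualization in R. If you find a mistake, please report it in an issue.
The first part uses the following datasets: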
Dataset | Information |
---|---|
Auto | Gas mileage, horsepower, and other information for cars. |
Boston | Housing values and other information about Boston suburbs. |
Caravan | Information about individuals offered caravan insurance. |
Carseats | Information about car seat sales in 400 stores. |
College | Demographic characteristics, tuition, and more for USA colleges. |
Default | Customer default records for a credit card company. |
Hitters | Records and salaries for baseball players. |
Khan | Gene expression measurements for four cancer types. |
NCI60 | Gene expression measurements for 64 cancer cell lines. |
OJ | Sales information for Citrus Hill and Minute Maid orange juice. |
Portfolio | Past values of financial assets, for use in portfolio allocation. |
Smarket | Daily percentage returns for S&P 500 over a 5-year period. |
USArrests | Crime statistics per 100,000 residents in 50 states of USA. |
Wage | Income survey data for males in central Atlantic region of USA. |
Weekly | 1,089 weekly stock market returns for 21 years. |
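
As a minimal sketch of how one of these datasets might be loaded before working through an exercise: `load_dataset` below is a hypothetical helper, not part of this project. `USArrests` ships with base R in the `datasets` package. The sketch assumes the other datasets come from the `ISLR` package, with `Boston` from `MASS`, which is where the textbook's first edition distributes them; the notes themselves may load the data differently.

```r
# Hypothetical helper (not from the notes): fetch a dataset by name
# from an installed package without attaching the whole package.
load_dataset <- function(name, package) {
  if (!requireNamespace(package, quietly = TRUE)) {
    stop("Package '", package, "' is not installed; try install.packages(\"",
         package, "\")")
  }
  env <- new.env()
  # data() copies the named dataset into `env` instead of the global workspace
  data(list = name, package = package, envir = env)
  get(name, envir = env)
}

# USArrests is bundled with base R, so this call needs no extra packages
arrests <- load_dataset("USArrests", "datasets")
dim(arrests)    # number of rows (states) and columns (variables)
names(arrests)  # variable names
```

Under the same assumption, `load_dataset("Auto", "ISLR")` or `load_dataset("Boston", "MASS")` would work once those packages are installed.
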
Author: Feel Liao