---
title: "Day_4_Code_Homework_06-10"
author: "Study Lead-Angus-Cardiology"
date: "`r Sys.Date()`"
output:
  prettydoc::html_pretty:
    theme: cayman
    # theme: tactile
    # theme: hpstr
    highlight: github
    # highlight: vignette
---
### 0. Generate the data
```{r}
df <- data.frame(
  "Medical.Science" = c("clinical medicine",
                        "Nursing", "Basic Medicine",
                        "pharmacy"),
  "score" = c(1, 2, NA, 4))
head(df)
```
### 1.1 Fill the missing value with the mean of the values above and below it
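#### 1.1.1 Quick and dirty
```{r}
# Suppose we have spotted that the NA is in the second column, between rows 2 and 4
df[, 2][is.na(df)[, 2]] <- mean(c(df$score[2], df$score[4]))
# Or, equivalently:
# df$score[is.na(df)[, 2]] <- mean(c(df$score[2], df$score[4]))
head(df)
```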
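The quick approach hard-codes rows 2 and 4. As a minimal sketch of the same idea without fixed indices, the helper below (the name `fill_neighbor_mean` is hypothetical, not part of the homework) fills every interior NA with the mean of its immediate neighbours. It assumes isolated NAs: runs of consecutive NAs, and NAs at either end, are left untouched.

```r
# Hypothetical helper: fill each interior NA with the mean of the value
# directly above and directly below it.
fill_neighbor_mean <- function(x) {
  idx <- which(is.na(x))
  # Only interior positions have both an upper and a lower neighbour
  idx <- idx[idx > 1 & idx < length(x)]
  for (i in idx) {
    # If a neighbour is itself NA, mean() returns NA and the gap stays open
    x[i] <- mean(c(x[i - 1], x[i + 1]))
  }
  x
}

score <- c(1, 2, NA, 4)
filled <- fill_neighbor_mean(score)
print(filled)  # the NA becomes (2 + 4) / 2 = 3
```

The same call can be applied to the homework column as `df$score <- fill_neighbor_mean(df$score)`.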
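#### 1.1.2 Using the Hmisc package
```{r}
# Hmisc package
# install.packages("Hmisc")
# Note: this needs a df that still contains the NA (re-run section 0 if it was already filled)
is.na(df)                         # find which variable contains the NA
index <- which(is.na(df$score))   # build the index of the missing value
print(index)
suppressMessages(library(Hmisc))  # R package for handling missing values
df$score <- impute(df$score, mean(c(df[index - 1, 2], df[index + 1, 2])))  # insert the value
head(df)
```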
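#### 1.1.3 Using the zoo package
```{r}
# zoo package
# install.packages("zoo")  # run once if zoo is not installed
library(zoo)
suppressMessages(library(tidyverse))
# na.approx() fills interior NAs by linear interpolation;
# na.rm = FALSE keeps leading/trailing NAs so the column length is unchanged
df %>% mutate(score = zoo::na.approx(score, na.rm = FALSE))
```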
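Linear interpolation gives the neighbour mean here because the missing row sits exactly halfway between rows 2 and 4. A rough base-R sketch of that calculation, using `stats::approx()` rather than zoo (the variable names `known` and `interp` are illustrative), might look like this:

```r
score <- c(1, 2, NA, 4)

# Positions that hold observed values
known <- !is.na(score)

# Interpolate linearly over the row index, evaluated at every row
interp <- approx(
  x    = which(known),
  y    = score[known],
  xout = seq_along(score)
)$y

print(interp)  # row 3 is interpolated halfway between 2 and 4
```

With the default `rule = 1`, `approx()` returns NA for positions outside the observed range, so leading or trailing NAs would stay missing.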
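### 1.2 Fill the missing value with the column mean
```{r}
# Suppose we have spotted that the NA is in the second column
# df[, 2][is.na(df)[, 2]] <- mean(df$score, na.rm = TRUE)
# Or, equivalently:
# df$score[is.na(df)[, 2]] <- mean(df$score, na.rm = TRUE)
# head(df)
```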
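If the position of the NA is not known in advance, the column-mean fill can be applied to every numeric column at once. The following is only a sketch on a copy of the data (`df_demo` and `num_cols` are names introduced here for illustration), using base R alone:

```r
# A fresh copy of the homework data, so the original df is not modified
df_demo <- data.frame(
  Medical.Science = c("clinical medicine", "Nursing", "Basic Medicine", "pharmacy"),
  score = c(1, 2, NA, 4)
)

# Flag the numeric columns
num_cols <- vapply(df_demo, is.numeric, logical(1))

# Replace the NAs in each numeric column with that column's mean
df_demo[num_cols] <- lapply(df_demo[num_cols], function(col) {
  col[is.na(col)] <- mean(col, na.rm = TRUE)
  col
})

print(df_demo$score)  # the NA becomes (1 + 2 + 4) / 3
```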
### 2. Extract the rows where score is greater than 3
```{r}
df[which(df$score > 3), ]
# Or use the filter() function
suppressMessages(library(tidyverse))
filter(df, score > 3)  # returns a data.frame
```
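Wrapping the condition in `which()` matters when the column still contains an NA. The sketch below uses a small stand-in data frame (`df_demo` with a hypothetical `id` column) to compare plain logical indexing with `which()`. `dplyr::filter()` behaves like the `which()` version, since it drops rows where the condition evaluates to NA.

```r
df_demo <- data.frame(id = 1:4, score = c(1, 2, NA, 4))

# NA > 3 is NA, and an NA row index produces a row filled with NAs
by_logical <- df_demo[df_demo$score > 3, ]

# which() keeps only the positions where the condition is TRUE
by_which <- df_demo[which(df_demo$score > 3), ]

print(nrow(by_logical))  # 2: one all-NA row plus the real match
print(nrow(by_which))    # 1: only the row with score 4
```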
### 3. Remove duplicates based on the score column
```{r}
df[!duplicated(df$score), ]
```
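The homework data has no repeated scores, so the effect of `duplicated()` is not visible there. As an illustration on made-up data (the `df_demo` values are hypothetical), `duplicated()` flags the second and later occurrences of a value, so negating it keeps the first row of each score:

```r
df_demo <- data.frame(
  name  = c("a", "b", "c", "d"),
  score = c(1, 2, 2, 4),
  stringsAsFactors = FALSE
)

# TRUE marks a value that has already appeared earlier in the vector
dup <- duplicated(df_demo$score)
print(dup)  # only the third element is flagged

# Keep the first occurrence of each score
deduped <- df_demo[!dup, ]
print(deduped$name)  # rows a, b and d remain
```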
### 4. Compute the mean of the score column
```{r}
mean(df$score, na.rm = TRUE)
summary(df$score)  # gives a few extra summary statistics as well
```
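Note that `mean()` returns NA as soon as one element is missing, unless `na.rm = TRUE` is set. A short sketch on the unfilled scores (variable names are illustrative):

```r
score <- c(1, 2, NA, 4)

m_raw   <- mean(score)                # NA, because one value is missing
m_clean <- mean(score, na.rm = TRUE)  # mean of 1, 2 and 4

print(m_raw)
print(m_clean)
```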
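### 5. Extract the score column and convert it to a vector
```{r}
v <- as.vector(df$score)  # the column as a plain vector
str(v)
a <- factor(df$score, levels = c(1, 2, 3, 4))  # the same values as a factor
str(a)
```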
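A factor is not a plain vector in R's sense, and converting it back to numbers needs some care. The sketch below uses illustrative values (not the homework data) to show that `as.numeric()` on a factor returns the internal level codes, while going through `as.character()` recovers the original values:

```r
score <- c(4, 2, 4)
f <- factor(score)  # levels are "2" and "4"

print(is.vector(score))  # TRUE
print(is.vector(f))      # FALSE: a factor carries levels and class attributes

codes  <- as.numeric(f)                # internal level codes: 2 1 2
values <- as.numeric(as.character(f))  # original values: 4 2 4

print(codes)
print(values)
```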