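📖 Overview
GWAS summary statistics downloaded from different databases often use different column names. This tutorial collects the common column-name correspondences.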
🔄 Common column-name mapping
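| Standard name | Common aliases | Description |
|---|---|---|
| SNP | rsid, rsids, Markername | rsID of the SNP |
| beta | estimate, effect size, ES, effect | Effect size (regression coefficient for continuous traits) |
| se | StdErr, standard error, sebeta | Standard error of the effect size |
| eaf | effect_allele_frequency, af_alt, freq | Effect allele frequency |
| effect_allele | Allele1, alt, A1 | Effect allele |
| other_allele | Allele2, ref, A2 | Other (non-effect) allele |
| pval | P, p-value, mlogp | P-value; mlogp stores -log10(p) and must be converted with 10^(-mlogp), not just renamed |
| chr | chromosome, CHR | Chromosome number |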
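Before renaming anything, it can help to see which columns of a downloaded file are recognised at all. The sketch below encodes the table above as a lookup and reports the standard name for each column, with `NA` for anything unrecognised. The helper `match_standard_names` and the example column names are hypothetical, introduced only for illustration; matching is exact and case-sensitive.

```r
# Alias lookup built from the table above (standard name -> known aliases).
aliases <- list(
  SNP           = c("rsid", "rsids", "Markername"),
  beta          = c("estimate", "effect size", "ES", "effect"),
  se            = c("StdErr", "standard error"),
  eaf           = c("effect_allele_frequency", "af_alt", "freq"),
  effect_allele = c("Allele1", "alt", "A1"),
  other_allele  = c("Allele2", "ref", "A2"),
  pval          = c("P", "p-value", "mlogp"),
  chr           = c("chromosome", "CHR")
)

# Hypothetical helper: return the standard name for each column name,
# or NA when the column is not listed in the alias table.
match_standard_names <- function(col_names, aliases) {
  alias_vec <- unlist(aliases, use.names = FALSE)
  std_vec   <- rep(names(aliases), lengths(aliases))
  std <- std_vec[match(col_names, alias_vec)]
  # Columns that already use a standard name map to themselves.
  already_std <- col_names %in% names(aliases)
  std[already_std] <- col_names[already_std]
  setNames(std, col_names)
}

# Example with made-up column names.
recognised <- match_standard_names(
  c("rsids", "StdErr", "af_alt", "pval", "n_cases"), aliases
)
print(recognised)
```

Columns reported as `NA` (here `n_cases`) are simply not covered by the table and need to be checked against the database's own documentation.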
🗄️ Column names by database
GWAS Catalog
After downloading, check the accompanying Readme.txt for the column definitions.
UK Biobank
- alt = effect allele
- ref = other allele
FinnGen
- ref = other allele
- alt = effect allele
- rsids = SNP
- mlogp = -log10(p-value)
- sebeta = se
- af_alt = eaf
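Because `mlogp` is a transformed p-value rather than a differently named one, FinnGen-style data needs one conversion step on top of the renaming. The following is a minimal base-R sketch under the mapping listed above; the data frame `finngen` and all of its values are made up for illustration and do not come from a real release.

```r
# Toy rows with FinnGen-style column names (values are invented).
finngen <- data.frame(
  rsids  = c("rs1", "rs2"),
  ref    = c("A", "C"),
  alt    = c("G", "T"),
  beta   = c(0.12, -0.05),
  sebeta = c(0.03, 0.02),
  af_alt = c(0.25, 0.40),
  mlogp  = c(4.2, 1.9)
)

# FinnGen name -> standard name, following the list above.
finngen_map <- c(
  rsids  = "SNP",
  ref    = "other_allele",
  alt    = "effect_allele",
  sebeta = "se",
  af_alt = "eaf"
)

# Rename only the columns that are actually present.
hits <- intersect(names(finngen_map), names(finngen))
names(finngen)[match(hits, names(finngen))] <- unname(finngen_map[hits])

# mlogp is -log10(p), so recover p by inverting the transform.
finngen$pval  <- 10^(-finngen$mlogp)
finngen$mlogp <- NULL

print(names(finngen))
```

For very large `mlogp` values the converted p-value can underflow to zero in double precision, so it may be worth keeping the `mlogp` column when extremely significant variants matter.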
💻 Data standardization code
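library(dplyr)

# Standardize column names
standardize_columns <- function(data) {
  # Mapping from alias (name) to standard column name (value)
  col_mapping <- c(
    "rsid" = "SNP",
    "rsids" = "SNP",
    "Markername" = "SNP",
    "estimate" = "beta",
    "effect_size" = "beta",
    "ES" = "beta",
    "effect" = "beta",
    "StdErr" = "se",
    "standard_error" = "se",
    "sebeta" = "se",
    "effect_allele_frequency" = "eaf",
    "af_alt" = "eaf",
    "freq" = "eaf",
    "Allele1" = "effect_allele",
    "alt" = "effect_allele",
    "A1" = "effect_allele",
    "Allele2" = "other_allele",
    "ref" = "other_allele",
    "A2" = "other_allele",
    "P" = "pval",
    "p-value" = "pval",
    "chromosome" = "chr",
    "CHR" = "chr"
  )
  # Keep only the aliases present in the data. rename() expects
  # new_name = old_name, so the mapping is inverted here.
  present <- intersect(names(col_mapping), names(data))
  data <- data %>% rename(all_of(setNames(present, col_mapping[present])))
  # mlogp stores -log10(p): convert it instead of renaming it
  if ("mlogp" %in% names(data) && !"pval" %in% names(data)) {
    data <- data %>% mutate(pval = 10^(-mlogp)) %>% select(-mlogp)
  }
  data
}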
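After standardization it is worth confirming that the columns a downstream analysis relies on are actually there. The sketch below is one possible check, not part of the original tutorial: the helper `check_required_columns` is hypothetical, and the default set of required columns is an assumption that should be adjusted to the analysis at hand.

```r
# Hypothetical helper: stop with a clear message if any required
# standardized column is missing; otherwise return the data invisibly.
check_required_columns <- function(
    data,
    required = c("SNP", "beta", "se", "effect_allele", "other_allele", "pval")
) {
  missing_cols <- setdiff(required, names(data))
  if (length(missing_cols) > 0) {
    stop("Missing standardized columns: ",
         paste(missing_cols, collapse = ", "))
  }
  invisible(data)
}

# Example with made-up data that lacks the `se` column.
toy <- data.frame(
  SNP = "rs1", beta = 0.1,
  effect_allele = "G", other_allele = "A", pval = 0.01
)
check_result <- tryCatch(
  check_required_columns(toy),
  error = function(e) conditionMessage(e)
)
print(check_result)
```

Failing early with the list of missing names makes it easier to spot an alias that the mapping does not yet cover.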