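Sometimes the values in a column of an R data frame carry a single quote, and the quote has to be removed before the data can be analysed. To remove single quotes from a string column, we can use the gsub function: pass the single quote as the pattern and replace it with an empty string (not a space), as the examples below show.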
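As a minimal sketch of the difference between replacing with an empty string and replacing with a space, the snippet below uses a small made-up vector (the name `countries` is hypothetical and is not part of the article's data).

```r
# Hypothetical vector, for illustration only
countries <- c("India'", "Sudan'", "Croatia'")

# Replace every single quote with the empty string ""
cleaned <- gsub("'", "", countries)

# Replacing with " " instead leaves a trailing space in each value
spaced <- gsub("'", " ", countries)

print(cleaned)
print(nchar(cleaned))  # lengths without the quote
print(nchar(spaced))   # one character longer because of the space
```

The character counts make the leftover space visible, which is easy to miss when the values are only printed.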
Consider the following data frame:
x1<-sample(c("India'","Sudan'","Croatia'"),20,replace=TRUE)
x2<-rpois(20,5)
df1<-data.frame(x1,x2)
df1

Output (the values are drawn at random, so a new run will give different rows)

         x1 x2
1    India'  6
2    Sudan'  3
3  Croatia'  9
4  Croatia'  3
5    Sudan'  4
6  Croatia'  4
7    India'  4
8  Croatia'  6
9    India'  4
10 Croatia'  7
11   Sudan'  8
12   India'  3
13 Croatia'  4
14   Sudan'  6
15   Sudan'  3
16   India' 11
17 Croatia'  8
18   Sudan'  6
19   Sudan' 10
20   Sudan'  5
Removing ' from column x1 of df1:

df1$x1<-gsub("'","",df1$x1)
df1

Output

        x1 x2
1    India  6
2    Sudan  3
3  Croatia  9
4  Croatia  3
5    Sudan  4
6  Croatia  4
7    India  4
8  Croatia  6
9    India  4
10 Croatia  7
11   Sudan  8
12   India  3
13 Croatia  4
14   Sudan  6
15   Sudan  3
16   India 11
17 Croatia  8
18   Sudan  6
19   Sudan 10
20   Sudan  5
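The removal above can also be written with `fixed = TRUE` and followed by a quick check. This is only a sketch: the `set.seed(1)` call is an assumption added to make the run reproducible, so the rows will not match the output printed in the article.

```r
set.seed(1)  # assumed seed, only so this sketch is reproducible
x1 <- sample(c("India'", "Sudan'", "Croatia'"), 20, replace = TRUE)
x2 <- rpois(20, 5)
df1 <- data.frame(x1, x2)

# fixed = TRUE matches "'" as a literal string instead of a regular expression
df1$x1 <- gsub("'", "", df1$x1, fixed = TRUE)

# Confirm that no single quote is left in the column
quote_left <- any(grepl("'", df1$x1, fixed = TRUE))
print(quote_left)
```

A single quote has no special meaning in a regular expression, so both forms give the same result here. `fixed = TRUE` mainly helps when the character being removed is a regex metacharacter such as `.` or `(`.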
Consider a second data frame, in which each value is wrapped in single quotes:

y1<-sample(c("'A'","'B'","'C'"),20,replace=TRUE)
y2<-rnorm(20,1)
df2<-data.frame(y1,y2)
df2

Output

    y1          y2
1  'B'  0.49282668
2  'B' -0.90061585
3  'B'  0.89346759
4  'A'  1.96469552
5  'A'  1.21931750
6  'A'  0.32022463
7  'A'  0.97912117
8  'B'  1.38781374
9  'C' -0.69066318
10 'B'  1.45014864
11 'C'  1.61876980
12 'C'  1.69046763
13 'C' -0.08073507
14 'B'  1.73212908
15 'C'  0.85473489
16 'B' -0.24975030
17 'A'  0.40313471
18 'B'  0.60537047
19 'B'  0.30200882
20 'C'  2.29497113
Removing ' from column y1 of df2:

df2$y1<-gsub("'","",df2$y1)
df2

Output

   y1          y2
1   B  0.49282668
2   B -0.90061585
3   B  0.89346759
4   A  1.96469552
5   A  1.21931750
6   A  0.32022463
7   A  0.97912117
8   B  1.38781374
9   C -0.69066318
10  B  1.45014864
11  C  1.61876980
12  C  1.69046763
13  C -0.08073507
14  B  1.73212908
15  C  0.85473489
16  B -0.24975030
17  A  0.40313471
18  B  0.60537047
19  B  0.30200882
20  C  2.29497113
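When several columns contain quotes, the same gsub call could be applied to each text column in turn. The helper below is a hypothetical sketch rather than part of the original example: the names `remove_single_quotes` and `df3` are made up for illustration, and factor columns are assumed to be acceptable as plain character columns afterwards.

```r
# Hypothetical helper: strip single quotes from every character or factor column
remove_single_quotes <- function(df) {
  is_text <- vapply(df, function(col) is.character(col) || is.factor(col), logical(1))
  if (any(is_text)) {
    # Factor columns are converted to character as a side effect
    df[is_text] <- lapply(df[is_text], function(col) {
      gsub("'", "", as.character(col), fixed = TRUE)
    })
  }
  df
}

# Small made-up data frame, for illustration only
df3 <- data.frame(
  y1 = c("'A'", "'B'", "'C'"),
  y2 = c("x'", "y", "'z"),
  y3 = c(1.5, 2.5, 3.5)
)

df3_clean <- remove_single_quotes(df3)
str(df3_clean)
```

Numeric columns such as y3 are left untouched, so the helper can be run on a whole data frame without first selecting the affected columns.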