How do I remove part of a string after a special character in R?

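Sometimes we do not need an entire string for analysis, especially when the extra part complicates the analysis or carries no meaning. In that case, we can remove the part of the string we consider unnecessary. For example, suppose we have the string ID:00001-1 but do not want the -1. We can remove it with the gsub function by matching the special character together with everything after it (.*) and replacing the match with an empty string.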

Example

> x1<-c("ID:00001-1","ID:00100-1","ID:00201-4","ID:014700-3","ID:12045-5","ID:00012-2","ID:10078-3")
> gsub("\\-.*","",x1)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x2<-c("ID:00001/1","ID:00100/1","ID:00201/4","ID:014700/3","ID:12045/5","ID:00012/2","ID:10078/3")
> gsub("\\/.*","",x2)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x3<-c("ID:00001_1","ID:00100_1","ID:00201_4","ID:014700_3","ID:12045_5","ID:00012_2","ID:10078_3")
> gsub("\\_.*","",x3)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x4<-c("ID:00001@1","ID:00100@1","ID:00201@4","ID:014700@3","ID:12045@5","ID:00012@2","ID:10078@3")
> gsub("\\@.*","",x4)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x5<-c("ID:00001*1","ID:00100*1","ID:00201*4","ID:014700*3","ID:12045*5","ID:00012*2","ID:10078*3")
> gsub("\\*.*","",x5)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x6<-c("ID:00001#1","ID:00100#1","ID:00201#4","ID:014700#3","ID:12045#5","ID:00012#2","ID:10078#3")
> gsub("\\#.*","",x6)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x7<-c("ID:00001()1","ID:00100()1","ID:00201()4","ID:014700()3","ID:12045()5","ID:00012()2","ID:10078()3")
> gsub("\\(\\).*","",x7)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x8<-c("ID:00001<>1","ID:00100<>1","ID:00201<>4","ID:014700<>3","ID:12045<>5","ID:00012<>2","ID:10078<>3")
> # "\\<" is a start-of-word anchor in R's default regex engine, so "<" is left unescaped here
> gsub("<>.*","",x8)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x9<-c("ID:00001&1","ID:00100&1","ID:00201&4","ID:014700&3","ID:12045&5","ID:00012&2","ID:10078&3")
> gsub("\\&.*","",x9)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
> x10<-c("ID:00001;1","ID:00100;1","ID:00201;4","ID:014700;3","ID:12045;5","ID:00012;2","ID:10078;3")
> gsub("\\;.*","",x10)
[1] "ID:00001" "ID:00100" "ID:00201" "ID:014700" "ID:12045" "ID:00012" "ID:10078"
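
The examples above all depend on knowing which characters need escaping in a regular expression. As a minimal sketch of an alternative, the delimiter can be located as a literal string with regexpr(..., fixed = TRUE), and the text before it kept with substr. The helper name strip_after and the sample vector ids are hypothetical and do not come from the original examples.

```r
# Hypothetical helper (not part of the original examples): drop everything
# from the first occurrence of a literal delimiter onward. Because the
# delimiter is matched with fixed = TRUE, no regex escaping is needed.
strip_after <- function(x, delim) {
  # Position of the first literal match in each element; -1 when absent.
  pos <- as.integer(regexpr(delim, x, fixed = TRUE))
  # Only touch elements that actually contain the delimiter.
  keep <- !is.na(pos) & pos > 0
  x[keep] <- substr(x[keep], 1, pos[keep] - 1)
  x
}

# Illustrative data: the last element has no delimiter and is left unchanged.
ids <- c("ID:00001<>1", "ID:00100<>1", "ID:014700")
strip_after(ids, "<>")
strip_after("ID:00201()4", "()")
strip_after("ID:12045*5", "*")
```

Since .* already consumes the rest of the string, each element yields at most one match, so sub would probably work as well as gsub in the regex-based examples.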