Use drop_duplicates:
    In [216]: df = pd.DataFrame({'A':[1,2,3,3,2],
         ...:                    'B':[1,7,3,0,8]})

    In [217]: df
    Out[217]:
       A  B
    0  1  1
    1  2  7
    2  3  3
    3  3  0
    4  2  8

    # keep only the last value
    In [218]: df.drop_duplicates(subset=['A'], keep='last')
    Out[218]:
       A  B
    0  1  1
    3  3  0
    4  2  8

    # keep only the first value, default value
    In [219]: df.drop_duplicates(subset=['A'], keep='first')
    Out[219]:
       A  B
    0  1  1
    1  2  7
    2  3  3

    # drop all duplicated values
    In [220]: df.drop_duplicates(subset=['A'], keep=False)
    Out[220]:
       A  B
    0  1  1
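For anyone who wants to run this outside IPython, here is a minimal standalone sketch of the same comparison. It uses the same example data as the session; the `results` dictionary is only an illustrative name introduced here. It prints which original index labels survive for each `keep` option.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 3, 2],
                   'B': [1, 7, 3, 0, 8]})

# One deduplicated frame per `keep` option, comparing rows on column A only.
results = {keep: df.drop_duplicates(subset=['A'], keep=keep)
           for keep in ('first', 'last', False)}

for keep, result in results.items():
    # drop_duplicates preserves the original index labels,
    # so the labels show exactly which rows were kept.
    print(f"keep={keep!r}: kept rows {result.index.tolist()}")
```

Note that `df` itself is unchanged here: each call returns a new DataFrame.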
If you don't want a copy of the DataFrame and would rather modify the existing one in place:
    In [221]: df = pd.DataFrame({'A':[1,2,3,3,2],
         ...:                    'B':[1,7,3,0,8]})

    In [222]: df.drop_duplicates(subset=['A'], inplace=True)

    In [223]: df
    Out[223]:
       A  B
    0  1  1
    1  2  7
    2  3  3
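One thing to watch for with `inplace=True`: the call returns `None`, so assigning its result back to `df` would lose the data. The sketch below, using the same example data, shows this and compares it with plain reassignment, which should give the same frame. The names `returned` and `df2` are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 3, 2],
                   'B': [1, 7, 3, 0, 8]})

# With inplace=True the DataFrame is modified and the method returns None.
returned = df.drop_duplicates(subset=['A'], inplace=True)
print(returned)

# Alternative: leave inplace at its default (False) and reassign the result.
df2 = pd.DataFrame({'A': [1, 2, 3, 3, 2],
                    'B': [1, 7, 3, 0, 8]})
df2 = df2.drop_duplicates(subset=['A'])

# Both approaches leave the same rows (first occurrence of each A value).
print(df.equals(df2))
```

Which form to use is mostly a matter of taste; reassignment tends to chain more naturally with other DataFrame methods.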