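If you need to set the values in column B to 0 for rows where column A holds a duplicated value, first create a boolean mask with Series.duplicated, then apply it with DataFrame.loc (formerly DataFrame.ix, which was removed in pandas 1.0) or Series.mask: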
    In [224]: df = pd.DataFrame({'A':[1,2,3,3,2],
         ...:                    'B':[1,7,3,0,8]})

    In [225]: mask = df.A.duplicated(keep=False)

    In [226]: mask
    Out[226]:
    0    False
    1     True
    2     True
    3     True
    4     True
    Name: A, dtype: bool

    In [227]: df.loc[mask, 'B'] = 0  # .loc replaces the removed .ix indexer

    In [228]: df['C'] = df.A.mask(mask, 0)

    In [229]: df
    Out[229]:
       A  B  C
    0  1  1  1
    1  2  0  0
    2  3  0  0
    3  3  0  0
    4  2  0  0
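The keep=False argument is what makes every row of a duplicated group True, including the first occurrence. As a rough sketch of how the three keep options of Series.duplicated differ on the same column (the variable names here are illustrative, not from the session above):

```python
import pandas as pd

a = pd.Series([1, 2, 3, 3, 2], name="A")

# keep='first' (the default) flags only the later occurrences
later_only = a.duplicated()
# keep='last' flags only the earlier occurrences
earlier_only = a.duplicated(keep="last")
# keep=False flags every member of a duplicated group
all_dupes = a.duplicated(keep=False)

print(later_only.tolist())    # [False, False, False, True, True]
print(earlier_only.tolist())  # [False, True, True, False, False]
print(all_dupes.tolist())     # [False, True, True, True, True]
```

With the default keep='first', rows 1 and 2 would not be flagged, so their B values would be left untouched.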
If you need the inverted mask, use ~:
    In [230]: df['C'] = df.A.mask(~mask, 0)

    In [231]: df
    Out[231]:
       A  B  C
    0  1  1  0
    1  2  0  2
    2  3  0  3
    3  3  0  3
    4  2  0  2
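For reference, the whole approach can be sketched as a standalone script. This is a minimal version assuming a pandas release where .ix is no longer available; the extra column D is hypothetical and only there to show that Series.where(mask, 0) should give the same result as Series.mask(~mask, 0):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 3, 2], "B": [1, 7, 3, 0, 8]})

# True for every row whose A value occurs more than once
mask = df["A"].duplicated(keep=False)

# Boolean-indexed assignment with .loc: zero out B on duplicated rows
df.loc[mask, "B"] = 0

# Series.mask replaces values where the condition is True,
# so inverting the mask zeroes the non-duplicated rows instead
df["C"] = df["A"].mask(~mask, 0)

# Series.where keeps values where the condition is True,
# which makes it the mirror image of Series.mask
df["D"] = df["A"].where(mask, 0)

print(df)
```

Which of mask(~cond, ...) or where(cond, ...) to use is a matter of taste; where avoids the explicit inversion.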