Java 8 Stream.distinct() for List Deduplication

In this article, we walk through Java 8's Stream.distinct() with examples. distinct() returns a stream consisting of the distinct elements of the original stream. distinct() is a method of the Stream interface.

distinct() uses hashCode() and equals() to decide which elements are distinct. Therefore, our element class must override hashCode() and equals().

If distinct() is operating on an ordered stream, then for duplicated elements the element appearing first in encounter order is preserved, and in this sense the selection of distinct elements is stable.

For unordered streams, no stability guarantee is made: which of the duplicates is selected may vary. distinct() is a stateful intermediate operation.

For parallel pipelines over ordered streams, preserving the stability of distinct() is relatively expensive, since it requires substantial buffering. If we do not need consistency with encounter order, we can switch to an unordered stream via BaseStream.unordered(), as sketched below.
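
For example, here is a minimal sketch (assuming the usual java.util and java.util.stream imports) of relaxing the encounter order before calling distinct() on a parallel stream:

// Sketch: on a parallel stream, dropping the encounter-order constraint via unordered()
// lets distinct() avoid the buffering needed to keep the first-seen element of each
// duplicate group; the order of the result is therefore not guaranteed.
List<String> values = Arrays.asList("AA", "BB", "CC", "BB", "CC", "AA", "AA");
List<String> result = values.parallelStream()
        .unordered()
        .distinct()
        .collect(Collectors.toList());
System.out.println(result);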

1. Stream.distinct()

The declaration of the distinct() method is as follows:

Stream<T> distinct()

It is a method of the Stream interface. In this example, we have a List of Strings that contains duplicate elements.

DistinctSimpleDemo.java

package com.concretepage;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
public class DistinctSimpleDemo {
    public static void main(String[] args) {
        List<String> list = Arrays.asList("AA", "BB", "CC", "BB", "CC", "AA", "AA");
        long l = list.stream().distinct().count();
        System.out.println("No. of distinct elements:" + l);
        String output = list.stream().distinct().collect(Collectors.joining(","));
        System.out.println(output);
    }
}

Output

No. of distinct elements:3

AA,BB,CC

2. Stream.distinct() with List of Objects

In this example, we have a list of Book objects. To deduplicate the list, the class overrides hashCode() and equals().

Book.java

package com.concretepage;
public class Book {
    private String name;
    private int price;
    public Book(String name, int price) {
        this.name = name;
        this.price = price;
    }
    public String getName() {
        return name;
    }
    public int getPrice() {
        return price;
    }
    @Override
    public boolean equals(final Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Book)) { // also covers obj == null
            return false;
        }
        final Book book = (Book) obj;
        return this.name.equals(book.name) && this.price == book.price;
    }
    @Override
    public int hashCode() {
        int hashno = 7;
        hashno = 13 * hashno + (name == null ? 0 : name.hashCode());
        return hashno;
    }
}

DistinctWithUserObjects.java

package com.concretepage;
import java.util.ArrayList;
import java.util.List;
public class DistinctWithUserObjects {
    public static void main(String[] args) {
        List<Book> list = new ArrayList<>();
        list.add(new Book("Core Java", 200));
        list.add(new Book("Core Java", 200));
        list.add(new Book("Learning Freemarker", 150));
        list.add(new Book("Spring MVC", 300));
        list.add(new Book("Spring MVC", 300));
        long l = list.stream().distinct().count();
        System.out.println("No. of distinct books:" + l);
        list.stream().distinct().forEach(b -> System.out.println(b.getName() + "," + b.getPrice()));
    }
}

Output

No. of distinct books:3
Core Java,200
Learning Freemarker,150
Spring MVC,300 

3. Distinct by Property

distinct() does not provide a direct way to deduplicate a list of objects by a particular property; it works purely on hashCode() and equals().

If we want to deduplicate a list of objects by one of their properties, we can achieve it with a different approach.

As shown in the following snippet:

static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    Map<Object, Boolean> seen = new ConcurrentHashMap<>();
    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}

The method above can be passed as an argument to the filter() method of the Stream interface, like this:

list.stream().filter(distinctByKey(b -> b.getName()));

distinctByKey() returns a Predicate that uses a ConcurrentHashMap to remember which keys have already been seen. Below is a complete example of deduplicating by an object property.

DistinctByProperty.java

package com.concretepage;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
public class DistinctByProperty {
    public static void main(String[] args) {
        List<Book> list = new ArrayList<>();
        list.add(new Book("Core Java", 200));
        list.add(new Book("Core Java", 300));
        list.add(new Book("Learning Freemarker", 150));
        list.add(new Book("Spring MVC", 200));
        list.add(new Book("Hibernate", 300));
        list.stream().filter(distinctByKey(b -> b.getName()))
                .forEach(b -> System.out.println(b.getName() + "," + b.getPrice()));
    }
    private static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }
}

Output

Core Java,200
Learning Freemarker,150
Spring MVC,200
Hibernate,300 

Source: https://www.concretepage.com/java/jdk-8/java-8-distinct-example

Supplementary: Conventional List Deduplication and Java 8 Stream Deduplication

I. Conventional Deduplication

When faced with deduplicating a List, besides removing duplicates while iterating, we often take advantage of the fact that a Set does not allow duplicate elements, converting between List and Set to drop the duplicates.

// Iterate and copy into a second list, keeping the original order
public static void ridRepeat1(List<String> list) {
    System.out.println("list = [" + list + "]");
    List<String> listNew = new ArrayList<String>();
    for (String str : list) {
        if (!listNew.contains(str)) {
            listNew.add(str);
        }
    }
    System.out.println("listNew = [" + listNew + "]");
}
// Deduplicate with a Set while iterating, keeping the original order
public static void ridRepeat2(List<String> list) {
    System.out.println("list = [" + list + "]");
    List<String> listNew = new ArrayList<String>();
    Set<String> set = new HashSet<String>();
    for (String str : list) {
        if (set.add(str)) {
            listNew.add(str);
        }
    }
    System.out.println("listNew = [" + listNew + "]");
}
// Deduplicate via a HashSet; the original order is NOT preserved, since HashSet is unordered
public static void ridRepeat3(List<String> list) {
    System.out.println("list = [" + list + "]");
    Set<String> set = new HashSet<String>();
    List<String> listNew = new ArrayList<String>();
    set.addAll(list);
    listNew.addAll(set);
    System.out.println("listNew = [" + listNew + "]");
}
// Same as ridRepeat3, condensed to one line; order is not preserved
public static void ridRepeat4(List<String> list) {
    System.out.println("list = [" + list + "]");
    List<String> listNew = new ArrayList<String>(new HashSet<String>(list));
    System.out.println("listNew = [" + listNew + "]");
}
// Deduplicate with a LinkedHashSet, which keeps the original insertion order
public static void ridRepeat5(List<String> list) {
    System.out.println("list = [" + list + "]");
    List<String> listNew = new ArrayList<String>(new LinkedHashSet<String>(list));
    System.out.println("listNew = [" + listNew + "]");
}

II. Deduplication with Java 8 Streams

1. Deduplicating with distinct()

// Deduplicate with the Java 8 Stream API
List<String> uniqueList = list.stream().distinct().collect(Collectors.toList());
System.out.println(uniqueList.toString());

distinct() works with whatever equals() and hashCode() the elements provide; if a class does not override them, the defaults inherited from Object (identity comparison) are used. Consequently:

The approach above works when the List elements are boxed primitives or Strings, but it will not work for a List of plain objects that rely on Object's defaults. However, if your entity class uses the widely adopted Lombok annotations such as @Data, equals() and hashCode() are generated for you automatically; and if you need to deduplicate by only a few key fields, you have to override equals() and hashCode() in the class yourself accordingly.
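
As a hedged illustration (Item is a hypothetical class, not one of this article's examples, and Lombok is assumed to be on the classpath), a class annotated with @Data gets equals() and hashCode() generated over all of its fields, so distinct() works on its instances directly:

import lombok.Data;

// Hypothetical Lombok-based element type: @Data generates equals(), hashCode(),
// getters and toString() over all fields, so no manual overrides are needed.
@Data
public class Item {
    private final String name;
    private final int price;
}

With that in place, itemList.stream().distinct().collect(Collectors.toList()) deduplicates by the combination of name and price.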

2. A Shorthand Using collectingAndThen and TreeSet

Note, however, that this approach does not preserve the original list order: the result is built from a TreeSet, so the elements come out sorted by the comparator key (lexicographically for the String keys used below). If the requirement does not care about the original order, it can be used directly.

// Deduplicate by the name property
// (assumes static imports of Collectors.collectingAndThen and Collectors.toCollection)
List<User> lt = list.stream().collect(
        collectingAndThen(
                toCollection(() -> new TreeSet<>(Comparator.comparing(User::getName))), ArrayList::new));
System.out.println("After dedup: " + lt);
// Deduplicate by the combination of the name and address properties
List<User> lt1 = list.stream().collect(
        collectingAndThen(
                toCollection(() -> new TreeSet<>(Comparator.comparing((User o) -> o.getName() + ";" + o.getAddress()))), ArrayList::new));
System.out.println("After dedup: " + lt1);

If the requirement explicitly calls for sorting as well, the shorthand above can be post-processed with the stream sorted() API, for example:

// Deduplicate by name, then sort the result; User does not implement Comparable,
// so an explicit Comparator is passed to sorted() (here sorting by age, as an example)
List<User> lt = list.stream().collect(
        collectingAndThen(
                toCollection(() -> new TreeSet<>(Comparator.comparing(User::getName))),
                v -> v.stream().sorted(Comparator.comparing(User::getAge)).collect(Collectors.toList())));

3. Using the filter() Method

We first create a method to be used as the argument to Stream.filter(). It returns a Predicate, and the idea is simply to test whether the element's key can still be added to a Set:

private static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> seen.add(keyExtractor.apply(t));
}

Usage:

@Test
public void distinctByProperty() throws JsonProcessingException {
    // Second approach: filter with a key extractor to deduplicate by one property of the object
    ObjectMapper objectMapper = new ObjectMapper();
    List<Student> studentList = getStudentList();

    System.out.print("Before dedup: ");
    System.out.println(objectMapper.writeValueAsString(studentList));
    studentList = studentList.stream().distinct().collect(Collectors.toList());
    System.out.print("After distinct(): ");
    System.out.println(objectMapper.writeValueAsString(studentList));
    // Pass distinctByKey() to filter() to drop the elements whose key has already been seen
    studentList = studentList.stream().filter(distinctByKey(Student::getName)).collect(Collectors.toList());
    System.out.print("After dedup by name: ");
    System.out.println(objectMapper.writeValueAsString(studentList));
}

Before dedup:

[{"stuNo":"001","name":"Tom"},{"stuNo":"001","name":"Tom"},{"stuNo":"003","name":"Tom"}]

After distinct():

[{"stuNo":"001","name":"Tom"},{"stuNo":"003","name":"Tom"}]

After dedup by name:

[{"stuNo":"001","name":"Tom"}]

III. Summing and Other Aggregations over Equal Elements

Besides deduplication, another common requirement in day-to-day work is aggregation. For example, across all product orders, compute the turnover per product name for each shop. That can be obtained directly with an SQL query, but here is how to do it simply in Java, using a similar case: summing the ages of users who share the same name and address.
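
As a minimal sketch of the idea (it assumes the User class and the list variable from the test code below, and is not part of the original write-up), Collectors.groupingBy combined with Collectors.summingInt produces the per-group sums without mutating the original objects:

// Sketch: group by equals()/hashCode() (name + address in the User class below)
// and sum the ages of each group into a new Map, leaving the original User objects untouched.
Map<User, Integer> ageByUser = list.stream()
        .collect(Collectors.groupingBy(u -> u, Collectors.summingInt(User::getAge)));
ageByUser.forEach((user, totalAge) ->
        System.out.println(user.getName() + "," + user.getAddress() + " -> " + totalAge));

The full example below instead reuses the grouped keys and mutates them, which also changes the original list, as the comments in the test code point out.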

User.java

package com.example.demo.dto;
import java.io.Serializable;
import java.util.Objects;
/**
 * @author: shf
 * description:
 * date: 2019/10/30 10:21
 */
public class User implements Serializable {
 private static final long serialVersionUID = 1L;
 private Long id;
 private String name;
 private String address;
 private Integer age;
 public User() {
 }
 public User(String name, String address, Integer age) {
  this.name = name;
  this.address = address;
  this.age = age;
 }
 public String getName() {
  return name;
 }
 public void setName(String name) {
  this.name = name;
 }
 public String getAddress() {
  return address;
 }
 public void setAddress(String address) {
  this.address = address;
 }
 public Integer getAge() {
  return age;
 }
 public void setAge(Integer age) {
  this.age = age;
 }
 @Override
 public String toString() {
  return "User{" +
    "name='" + name + '\'' +
    ", address='" + address + '\'' +
    ", age=" + age +
    '}';
 }
 @Override
 public boolean equals(Object obj) {
  if (this == obj) {
   return true; // same object reference
  }
  if (obj == null) {
   return false; // non-nullity: for any non-null reference x, x.equals(null) must return false
  }
  if (obj instanceof User) {
   User other = (User) obj;
   // the two objects are equal if the fields we compare are equal
   if (Objects.equals(this.name, other.name)
     && Objects.equals(this.address, other.address)) {
    return true;
   }
  }
  return false;
 }
 @Override
 public int hashCode() {
  return Objects
    .hash(name, address);
 }
}

Test code:

package com.example.demo;
import com.example.demo.dto.User;
import java.util.*;
import java.util.stream.Collectors;
public class FirCes {
 public static void main(String[] args) {
   /* Build the test data set */
  User user1 = new User("a小张1", "a1", 10);
  User user2 = new User("b小张2", "a2", 10);
  User user3 = new User("c小张3", "a3", 10);
  User user3_3 = new User("c小张3", "a", 10);
  User user33 = new User("c小张3", "a3", 10);
  User user4 = new User("d小张4", "a4", 10);
  User user5 = new User("e小张5", "a5", 10);
  List<User> list = new ArrayList<>();
  list.add(user1);
  list.add(user2);
  list.add(user3);
  list.add(user3_3);
  list.add(user33);
  list.add(user4);
  list.add(user5);
   // Group User objects that have the same name and address
  Map<User, List<User>> listMap = list.stream().collect(Collectors.groupingBy(v -> v));
   /* First, inspect the grouping result */
  listMap.forEach((key, value) -> {
   System.out.println("========");
   System.out.println("key:" + key);
   value.forEach(obj -> {
    System.out.println(obj);
   });
  });
   /* Final result */
  List<User> listNew = listMap.keySet().stream().map(u -> {
   int sum = listMap.get(u).stream().mapToInt(i -> i.getAge()).sum();
    // Note: this also mutates the original objects in list, because each grouping key u
    // is a reference to an object from the original list. If the original list must not be
    // affected, create and return a new User object here instead.
   u.setAge(sum);
   return u;
  }).collect(Collectors.toList());
  System.out.println("listNew:" + listNew);
  System.err.println("list:" + list);
   // An entity class can only override equals() once, which is awkward when several different
   // equality criteria are needed; either define several classes with the same fields, or use
   // the approach below (group by a composite String key).
  Map<String, List<User>> listMap1 = list.stream().collect(Collectors
    .groupingBy(v -> Optional.ofNullable(v.getName()).orElse("") + "_" + Optional.ofNullable(v.getAddress()).orElse("")));
   /* First, inspect the grouping result */
  listMap1.forEach((key, value) -> {
   System.out.println("========");
   System.out.println("key:" + key);
   value.forEach(obj -> {
    System.out.println(obj);
   });
  });
   /* Final result */
  List<User> listNew1 = listMap1.keySet().stream().map(u -> {
   int sum = listMap1.get(u).stream().mapToInt(i -> i.getAge()).sum();
   User user = listMap1.get(u).get(0);
    // Same caveat as above: this mutates the objects that the original list still references
   user.setAge(sum);
   return user;
  }).collect(Collectors.toList());
  System.out.println("listNew1:" + listNew1);
  System.err.println("list:" + list);
 }
 
}

Console output:

========
key:User{name='b小张2', address='a2', age=10}
User{name='b小张2', address='a2', age=10}
========
key:User{name='c小张3', address='a', age=10}
User{name='c小张3', address='a', age=10}
========
key:User{name='c小张3', address='a3', age=10}
User{name='c小张3', address='a3', age=10}
User{name='c小张3', address='a3', age=10}
========
key:User{name='a小张1', address='a1', age=10}
User{name='a小张1', address='a1', age=10}
========
key:User{name='d小张4', address='a4', age=10}
User{name='d小张4', address='a4', age=10}
========
key:User{name='e小张5', address='a5', age=10}
User{name='e小张5', address='a5', age=10}
listNew:[User{name='b小张2', address='a2', age=10}, User{name='c小张3', address='a', age=10}, User{name='c小张3', address='a3', age=20}, User{name='a小张1', address='a1', age=10}, User{name='d小张4', address='a4', age=10}, User{name='e小张5', address='a5', age=10}]
list:[User{name='a小张1', address='a1', age=10}, User{name='b小张2', address='a2', age=10}, User{name='c小张3', address='a3', age=20}, User{name='c小张3', address='a', age=10}, User{name='c小张3', address='a3', age=10}, User{name='d小张4', address='a4', age=10}, User{name='e小张5', address='a5', age=10}]
========
key:a小张1_a1
User{name='a小张1', address='a1', age=10}
========
key:c小张3_a
User{name='c小张3', address='a', age=10}
========
key:d小张4_a4
User{name='d小张4', address='a4', age=10}
========
key:e小张5_a5
User{name='e小张5', address='a5', age=10}
========
key:b小张2_a2
User{name='b小张2', address='a2', age=10}
========
key:c小张3_a3
User{name='c小张3', address='a3', age=20}
User{name='c小张3', address='a3', age=10}
listNew1:[User{name='a小张1', address='a1', age=10}, User{name='c小张3', address='a', age=10}, User{name='d小张4', address='a4', age=10}, User{name='e小张5', address='a5', age=10}, User{name='b小张2', address='a2', age=10}, User{name='c小张3', address='a3', age=30}]
list:[User{name='a小张1', address='a1', age=10}, User{name='b小张2', address='a2', age=10}, User{name='c小张3', address='a3', age=30}, User{name='c小张3', address='a', age=10}, User{name='c小张3', address='a3', age=10}, User{name='d小张4', address='a4', age=10}, User{name='e小张5', address='a5', age=10}]
Process finished with exit code 0

That is everything in this piece on deduplicating a List with Java 8 Stream.distinct(); hopefully it serves as a useful reference, and thanks for supporting 呐喊教程.
