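Deduplicating a List: conventional approaches and Java 8 approaches
I. Conventional deduplication
When a List needs deduplicating, besides looping over it and checking each element, a common approach is to exploit the fact that a Set does not allow duplicate elements: convert the List to a Set and back again to drop the duplicates.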
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Loop and copy each element into another list if it is not there yet; keeps the original order
public static void ridRepeat1(List<String> list) {
    System.out.println("list = [" + list + "]");
    List<String> listNew = new ArrayList<String>();
    for (String str : list) {
        if (!listNew.contains(str)) {
            listNew.add(str);
        }
    }
    System.out.println("listNew = [" + listNew + "]");
}

// Deduplicate with a Set as a "seen" filter; keeps the original order
public static void ridRepeat2(List<String> list) {
    System.out.println("list = [" + list + "]");
    List<String> listNew = new ArrayList<String>();
    Set<String> set = new HashSet<String>();
    for (String str : list) {
        // add() returns false when the element is already in the set
        if (set.add(str)) {
            listNew.add(str);
        }
    }
    System.out.println("listNew = [" + listNew + "]");
}

// Deduplicate with a Set; a HashSet is unordered, so the original order is not kept
public static void ridRepeat3(List<String> list) {
    System.out.println("list = [" + list + "]");
    Set<String> set = new HashSet<String>();
    List<String> listNew = new ArrayList<String>();
    set.addAll(list);
    listNew.addAll(set);
    System.out.println("listNew = [" + listNew + "]");
}

// Deduplicate with a Set (ridRepeat3 reduced to one line); unordered
public static void ridRepeat4(List<String> list) {
    System.out.println("list = [" + list + "]");
    List<String> listNew = new ArrayList<String>(new HashSet<String>(list));
    System.out.println("listNew = [" + listNew + "]");
}

// Deduplicate with a LinkedHashSet, which keeps the original order
public static void ridRepeat5(List<String> list) {
    System.out.println("list = [" + list + "]");
    List<String> listNew = new ArrayList<String>(new LinkedHashSet<String>(list));
    System.out.println("listNew = [" + listNew + "]");
}
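The methods above print their result instead of returning it. As a minimal sketch of how the LinkedHashSet variant might be packaged for reuse, the following returns a new list and leaves the input untouched. The class name RidRepeatDemo and the method name dedupKeepOrder are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class RidRepeatDemo {
    // Hypothetical helper: returns a new list without duplicates,
    // keeping the first occurrence of each element in its original position.
    static <T> List<T> dedupKeepOrder(List<T> list) {
        return new ArrayList<>(new LinkedHashSet<>(list));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(dedupKeepOrder(input)); // prints [b, a, c]
    }
}
```

Like every Set-based approach, this relies on the elements' equals and hashCode methods.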
II. Deduplicating with Java 8 streams
1. Deduplicating with distinct()
// Deduplicate with a Java 8 stream
List uniqueList = list.stream().distinct().collect(Collectors.toList());
System.out.println(uniqueList.toString());
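distinct() compares elements through their equals and hashCode methods. A class that does not override them inherits the versions from Object, which compare by reference. Therefore:
The approach above works when the List holds wrapper types (Integer, Long and so on) or Strings, because those classes override both methods, but it has no effect on instances of a custom class that does not. If your entity class uses the widely used Lombok annotations such as @Data, equals and hashCode are generated for you. If the requirement is to deduplicate on a few core fields only, you have to override equals and hashCode in that class yourself.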
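To make the difference concrete, here is a small sketch with two made-up classes: PlainPoint does not override equals and hashCode, ValuePoint does. Both class names and the countDistinct helper are assumptions for this example only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class DistinctEqualsDemo {
    // Hypothetical class that inherits equals/hashCode from Object (reference comparison).
    static class PlainPoint {
        final int x;
        final int y;
        PlainPoint(int x, int y) { this.x = x; this.y = y; }
    }

    // Hypothetical class that overrides equals/hashCode based on its fields.
    static class ValuePoint {
        final int x;
        final int y;
        ValuePoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ValuePoint)) return false;
            ValuePoint other = (ValuePoint) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() { return Objects.hash(x, y); }
    }

    // Counts the elements that survive Stream.distinct().
    static <T> long countDistinct(List<T> items) {
        return items.stream().distinct().count();
    }

    public static void main(String[] args) {
        // Two separate instances with the same field values.
        System.out.println(countDistinct(Arrays.asList(new PlainPoint(1, 2), new PlainPoint(1, 2)))); // prints 2
        System.out.println(countDistinct(Arrays.asList(new ValuePoint(1, 2), new ValuePoint(1, 2)))); // prints 1
    }
}
```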
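2. A shorthand using the new features
Note that this approach does not keep the original list order: the result comes out of a TreeSet, so it is sorted by the comparator key (lexicographically for String keys), and for each key only the first element encountered is kept. Use it directly if the original order does not matter.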
// Deduplicate by the name property
List<User> lt = list.stream().collect(
        Collectors.collectingAndThen(
                Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(User::getName))),
                ArrayList::new));
System.out.println("After dedup: " + lt);

// Deduplicate by the name and address properties
List<User> lt1 = list.stream().collect(
        Collectors.collectingAndThen(
                Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(o -> o.getName() + ";" + o.getAddress()))),
                ArrayList::new));
System.out.println("After dedup: " + lt1);
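If the original order does matter, one possible alternative (not part of the shorthand above) is to collect into a LinkedHashMap with Collectors.toMap and keep the first element per key. The sketch below uses plain Strings keyed by their first character so that it stays self-contained; the class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class KeepOrderByKeyDemo {
    // Hypothetical helper: keeps the first element seen for each key,
    // in the order in which the keys first appear in the input.
    static <T, K> List<T> distinctByKeyKeepOrder(List<T> list,
                                                 Function<? super T, ? extends K> keyExtractor) {
        Map<K, T> firstByKey = list.stream().collect(Collectors.toMap(
                keyExtractor,
                Function.identity(),
                (first, second) -> first,   // on a duplicate key, keep the earlier element
                LinkedHashMap::new));       // LinkedHashMap remembers insertion order
        return new ArrayList<>(firstByKey.values());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "apple", "plum", "avocado");
        // Key: first character of each word.
        System.out.println(distinctByKeyKeepOrder(words, w -> w.charAt(0))); // prints [pear, apple]
    }
}
```

With the TreeSet shorthand and the same key, the result would be ordered by key instead, that is apple before pear.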
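If the requirements call for a specific ordering, the result of the shorthand above can be processed further with the stream sorted() API. Note that sorted() without arguments requires the elements to implement Comparable, which User does not, so a Comparator is passed here (sorting by age is only an example).
List<User> sortedList = list.stream().collect(
        Collectors.collectingAndThen(
                Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(User::getName))),
                v -> v.stream().sorted(Comparator.comparing(User::getAge)).collect(Collectors.toList())));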
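3. Using filter()
First we create a method whose result is passed to Stream.filter(); its return type is Predicate. The idea is to test whether an element's key can still be added to a Set. The code is as follows: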
private static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    // Keys seen so far; add() returns false for a key that is already present
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> seen.add(keyExtractor.apply(t));
}
Usage:
@Test
public void distinctByProperty() throws JsonProcessingException {
    // Here, the second approach deduplicates by an object property through filtering
    ObjectMapper objectMapper = new ObjectMapper();
    List<Student> studentList = getStudentList();

    System.out.print("Before dedup: ");
    System.out.println(objectMapper.writeValueAsString(studentList));

    studentList = studentList.stream().distinct().collect(Collectors.toList());
    System.out.print("After distinct: ");
    System.out.println(objectMapper.writeValueAsString(studentList));

    // Pass distinctByKey() to filter() to drop the elements whose key cannot be added to the set
    studentList = studentList.stream().filter(distinctByKey(Student::getName)).collect(Collectors.toList());
    System.out.print("After dedup by name: ");
    System.out.println(objectMapper.writeValueAsString(studentList));
}
Output:
Before dedup: [{"stuNo":"001","name":"Tom"},{"stuNo":"001","name":"Tom"},{"stuNo":"003","name":"Tom"}]
After distinct: [{"stuNo":"001","name":"Tom"},{"stuNo":"003","name":"Tom"}]
After dedup by name: [{"stuNo":"001","name":"Tom"}]
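The test above depends on Jackson and on a getStudentList() helper that is not shown. As a self-contained sketch of the same filter technique, the following defines a minimal stand-in Student class (an assumption, not the original class) and returns the student numbers that survive deduplication by name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByKeyDemo {
    // Minimal stand-in for the Student class used in the test above (assumed shape).
    static class Student {
        private final String stuNo;
        private final String name;
        Student(String stuNo, String name) { this.stuNo = stuNo; this.name = name; }
        String getStuNo() { return stuNo; }
        String getName() { return name; }
    }

    // Same technique as distinctByKey above: a stateful predicate backed by a concurrent set.
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> seen.add(keyExtractor.apply(t));
    }

    // Keeps the first student per name and returns their student numbers.
    static List<String> firstStuNoPerName(List<Student> students) {
        return students.stream()
                .filter(distinctByKey(Student::getName))
                .map(Student::getStuNo)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("001", "Tom"),
                new Student("001", "Tom"),
                new Student("003", "Tom"),
                new Student("004", "Ann"));
        System.out.println(firstStuNoPerName(students)); // prints [001, 004]
    }
}
```

Two things to keep in mind: the predicate holds state, so a fresh one should be created for each stream, and the set returned by ConcurrentHashMap.newKeySet() rejects null, so a key extractor that returns null will cause a NullPointerException.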
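III. Summing over identical elements and similar operations
Besides deduplicating collections, another common requirement at work is aggregation. For example, across all product orders, compute the transaction amount for each product name within the same shop. This can be done directly in SQL; here is a simple way to do it in Java. A similar case serves as the example: compute the sum of the ages of users that share the same name and address.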
User.java
package com.example.demo.dto;

import java.io.Serializable;
import java.util.Objects;

/**
 * @author: shf
 * description:
 * date: 2019/10/30 10:21
 */
public class User implements Serializable {
    private static final long serialVersionUID = 1L;

    private Long id;
    private String name;
    private String address;
    private Integer age;

    public User() {
    }

    public User(String name, String address, Integer age) {
        this.name = name;
        this.address = address;
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getAddress() {
        return address;
    }

    public void setAddress(String address) {
        this.address = address;
    }

    public Integer getAge() {
        return age;
    }

    public void setAge(Integer age) {
        this.age = age;
    }

    @Override
    public String toString() {
        return "User{" +
                "name='" + name + '\'' +
                ", address='" + address + '\'' +
                ", age=" + age +
                '}';
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true; // same reference
        }
        if (obj == null) {
            return false; // for any non-null reference x, x.equals(null) should return false
        }
        if (obj instanceof User) {
            User other = (User) obj;
            // the two objects are equal if the fields being compared are equal
            if (Objects.equals(this.name, other.name) && Objects.equals(this.address, other.address)) {
                return true;
            }
        }
        return false;
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, address);
    }
}
Test code:
package com.example.demo;

import com.example.demo.dto.User;

import java.util.*;
import java.util.stream.Collectors;

public class FirCes {
    public static void main(String[] args) {
        /* Build the test data */
        User user1 = new User("a小张1", "a1", 10);
        User user2 = new User("b小张2", "a2", 10);
        User user3 = new User("c小张3", "a3", 10);
        User user3_3 = new User("c小张3", "a", 10);
        User user33 = new User("c小张3", "a3", 10);
        User user4 = new User("d小张4", "a4", 10);
        User user5 = new User("e小张5", "a5", 10);
        List<User> list = new ArrayList<>();
        list.add(user1);
        list.add(user2);
        list.add(user3);
        list.add(user3_3);
        list.add(user33);
        list.add(user4);
        list.add(user5);

        // Group the users by name and address (via User.equals/hashCode)
        Map<User, List<User>> listMap = list.stream().collect(Collectors.groupingBy(v -> v));
        /* First look at the grouping */
        listMap.forEach((key, value) -> {
            System.out.println("========");
            System.out.println("key:" + key);
            value.forEach(obj -> {
                System.out.println(obj);
            });
        });
        /* Final result */
        List<User> listNew = listMap.keySet().stream().map(u -> {
            int sum = listMap.get(u).stream().mapToInt(i -> i.getAge()).sum();
            // Note: this also changes the data in the original list, because the key u
            // is a reference to one of the objects in the original list.
            // To leave the original list untouched, create a new User() here,
            // set its values and return the new object instead.
            u.setAge(sum);
            return u;
        }).collect(Collectors.toList());
        System.out.println("listNew:" + listNew);
        System.err.println("list:" + list);

        // A class can override equals only once, which is awkward when several
        // different equality criteria are needed. You can define several classes
        // with different names but the same fields, or use the approach below.
        Map<String, List<User>> listMap1 = list.stream().collect(Collectors
                .groupingBy(v -> Optional.ofNullable(v.getName()).orElse("") + "_"
                        + Optional.ofNullable(v.getAddress()).orElse("")));
        /* First look at the grouping */
        listMap1.forEach((key, value) -> {
            System.out.println("========");
            System.out.println("key:" + key);
            value.forEach(obj -> {
                System.out.println(obj);
            });
        });
        /* Final result */
        List<User> listNew1 = listMap1.keySet().stream().map(u -> {
            int sum = listMap1.get(u).stream().mapToInt(i -> i.getAge()).sum();
            User user = listMap1.get(u).get(0);
            // Same as above: this also modifies the referenced object in the original list
            user.setAge(sum);
            return user;
        }).collect(Collectors.toList());
        System.out.println("listNew1:" + listNew1);
        System.err.println("list:" + list);
    }
}
Printed log:
========
key:User{name='b小张2', address='a2', age=10}
User{name='b小张2', address='a2', age=10}
========
key:User{name='c小张3', address='a', age=10}
User{name='c小张3', address='a', age=10}
========
key:User{name='c小张3', address='a3', age=10}
User{name='c小张3', address='a3', age=10}
User{name='c小张3', address='a3', age=10}
========
key:User{name='a小张1', address='a1', age=10}
User{name='a小张1', address='a1', age=10}
========
key:User{name='d小张4', address='a4', age=10}
User{name='d小张4', address='a4', age=10}
========
key:User{name='e小张5', address='a5', age=10}
User{name='e小张5', address='a5', age=10}
listNew:[User{name='b小张2', address='a2', age=10}, User{name='c小张3', address='a', age=10}, User{name='c小张3', address='a3', age=20}, User{name='a小张1', address='a1', age=10}, User{name='d小张4', address='a4', age=10}, User{name='e小张5', address='a5', age=10}]
list:[User{name='a小张1', address='a1', age=10}, User{name='b小张2', address='a2', age=10}, User{name='c小张3', address='a3', age=20}, User{name='c小张3', address='a', age=10}, User{name='c小张3', address='a3', age=10}, User{name='d小张4', address='a4', age=10}, User{name='e小张5', address='a5', age=10}]
========
key:a小张1_a1
User{name='a小张1', address='a1', age=10}
========
key:c小张3_a
User{name='c小张3', address='a', age=10}
========
key:d小张4_a4
User{name='d小张4', address='a4', age=10}
========
key:e小张5_a5
User{name='e小张5', address='a5', age=10}
========
key:b小张2_a2
User{name='b小张2', address='a2', age=10}
========
key:c小张3_a3
User{name='c小张3', address='a3', age=20}
User{name='c小张3', address='a3', age=10}
listNew1:[User{name='a小张1', address='a1', age=10}, User{name='c小张3', address='a', age=10}, User{name='d小张4', address='a4', age=10}, User{name='e小张5', address='a5', age=10}, User{name='b小张2', address='a2', age=10}, User{name='c小张3', address='a3', age=30}]
list:[User{name='a小张1', address='a1', age=10}, User{name='b小张2', address='a2', age=10}, User{name='c小张3', address='a3', age=30}, User{name='c小张3', address='a', age=10}, User{name='c小张3', address='a3', age=10}, User{name='d小张4', address='a4', age=10}, User{name='e小张5', address='a5', age=10}]
Process finished with exit code 0
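As the log shows, both variants overwrite the age of objects in the original list, and the second grouping therefore sums an already modified value (30 instead of 20). One way to avoid this, sketched below as an alternative rather than as the method used above, is to let the collector do the summing with Collectors.groupingBy and Collectors.summingInt, which only reads the elements. The immutable User class, the keyOf helper and the ASCII test data here are simplified stand-ins invented for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SumAgeByKeyDemo {
    // Simplified, immutable stand-in for the User class above (assumed shape).
    static class User {
        private final String name;
        private final String address;
        private final int age;
        User(String name, String address, int age) {
            this.name = name;
            this.address = address;
            this.age = age;
        }
        String getName() { return name; }
        String getAddress() { return address; }
        int getAge() { return age; }
    }

    // Hypothetical composite key: a List of (name, address), compared by value.
    static List<String> keyOf(User u) {
        return Arrays.asList(u.getName(), u.getAddress());
    }

    // Sums ages per (name, address) without modifying any User;
    // LinkedHashMap keeps the groups in first-seen order.
    static Map<List<String>, Integer> sumAgeByNameAndAddress(List<User> users) {
        return users.stream().collect(Collectors.groupingBy(
                SumAgeByKeyDemo::keyOf,
                LinkedHashMap::new,
                Collectors.summingInt(User::getAge)));
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("c", "a3", 10),
                new User("c", "a", 10),
                new User("c", "a3", 10),
                new User("d", "a4", 10));
        System.out.println(sumAgeByNameAndAddress(users));
        // prints {[c, a3]=20, [c, a]=10, [d, a4]=10}
    }
}
```

Because the grouping key is built outside the entity class, different criteria can be used side by side without touching equals and hashCode.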
References:
https://www.cnblogs.com/zjfjava/p/9897650.html
https://blog.csdn.net/haiyoung/article/details/80934467
https://blog.csdn.net/weixin_34185560/article/details/91464917