Java - Filter A Stream With Lambda Expressions

Republished by Plato

Java Streams were introduced back in Java 8, in 2014, in an effort to bring the Functional Programming paradigm to the otherwise verbose Java. Java Streams expose many flexible and powerful functional operations that let you process collections in one-liners.

Filtering collections based on some predicate remains one of the most commonly used functional operations, and can be performed with a Predicate or, more concisely, with a Lambda Expression.

In this short guide, we’ll take a look at how you can filter a Java 8 Stream with Lambda Expressions.

Filtering Streams in Java

In general, any Stream can be filtered via the filter() method and a given predicate:

Stream<T> filter(Predicate<? super T> predicate)

Each element in the stream is tested against the predicate, and is added to the output stream if the predicate returns true. You can supply a Predicate instance:

Predicate<String> contains = s -> s.contains("_deprecated");
List<String> results = stream.filter(contains).collect(Collectors.toList());

Or, simplify it by providing a Lambda Expression:

List<String> results = stream.filter(s -> s.contains("_deprecated"))
                             .collect(Collectors.toList());

Or even collapse the Lambda Expression into a method reference:


List<String> results = stream.filter(String::isEmpty)
                             .collect(Collectors.toList());

With method references, you can’t pass arguments. You can, however, define methods on the class you’re filtering and tailor them to be easily filterable (as long as the method accepts no arguments and returns a boolean).
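As a minimal sketch, the three styles above can be put side by side in a runnable form. The sample strings and the Task class (with its isDone() method) are hypothetical and only serve to illustrate a boolean, no-argument method that works as a method reference:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class FilterStyles {

    // Hypothetical sample data, not taken from the article.
    static final List<String> NAMES =
            Arrays.asList("parse_deprecated", "load", "", "save_deprecated");

    // 1. A named Predicate instance passed to filter().
    static List<String> withPredicate() {
        Predicate<String> contains = s -> s.contains("_deprecated");
        return NAMES.stream().filter(contains).collect(Collectors.toList());
    }

    // 2. The same check written as an inline lambda expression.
    static List<String> withLambda() {
        return NAMES.stream()
                .filter(s -> s.contains("_deprecated"))
                .collect(Collectors.toList());
    }

    // 3. A method reference to a no-argument method returning boolean.
    static List<String> withMethodReference() {
        return NAMES.stream()
                .filter(String::isEmpty)
                .collect(Collectors.toList());
    }

    // Hypothetical class exposing a boolean method tailored for filtering.
    static class Task {
        final String title;
        final boolean done;

        Task(String title, boolean done) {
            this.title = title;
            this.done = done;
        }

        boolean isDone() {
            return done;
        }
    }

    // Keeps only finished tasks and returns their titles.
    static List<String> doneTitles(List<Task> tasks) {
        return tasks.stream()
                .filter(Task::isDone)
                .map(t -> t.title)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(withPredicate());
        System.out.println(withLambda());
        System.out.println(withMethodReference());
        System.out.println(doneTitles(Arrays.asList(
                new Task("write", true), new Task("review", false))));
    }
}
```

The first two methods return the same list, since the lambda is just the inlined form of the named predicate.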

Keep in mind that streams are not collections. They are sequences of elements drawn from a source such as a collection, and you’ll have to collect them back into a collection such as a List or a Map to give the results permanence. Additionally, all operations on a stream are either intermediate or terminal:

Intermediate operations return a new stream with changes from the previous operation
Terminal operations return a data type and are meant to end a pipeline of processing on a stream

filter() is an intermediate operation, and is meant to be chained with other intermediate operations before the stream is terminated. To persist any changes (such as transformed elements or filtered results), you’ll have to run a terminal operation and assign its result to a new reference variable.
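The distinction can be observed directly. In the sketch below (class, method and variable names are illustrative assumptions), the predicate records every element it is asked about, which shows that filter() on its own does no work until a terminal operation such as collect() runs, and that the source list is left untouched:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyFilterDemo {

    // Records every element the predicate is asked about.
    static final List<String> checked = new ArrayList<>();

    // Builds a pipeline with an intermediate operation only;
    // the predicate is not evaluated here.
    static Stream<String> buildPipeline(List<String> source) {
        return source.stream().filter(s -> {
            checked.add(s);
            return s.length() > 3;
        });
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("map", "filter", "collect");

        Stream<String> pipeline = buildPipeline(words);
        System.out.println("Checked before terminal operation: " + checked.size());

        // The terminal operation triggers the filtering and produces a new list.
        List<String> result = pipeline.collect(Collectors.toList());
        System.out.println("Checked after terminal operation: " + checked.size());

        System.out.println("Result: " + result);
        System.out.println("Source: " + words);
    }
}
```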

Note: Even when chaining many lambda expressions, you can avoid readability issues by using proper line breaks.

In the following examples, we’ll be working with this list of books:

Book book1 = new Book("001", "Our Mathematical Universe", "Max Tegmark", 432, 2014);
Book book2 = new Book("002", "Life 3.0", "Max Tegmark", 280, 2017);
Book book3 = new Book("003", "Sapiens", "Yuval Noah Harari", 443, 2011);
        
List<Book> books = Arrays.asList(book1, book2, book3);
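The Book class itself is not shown in this guide. A minimal sketch that is consistent with the constructor calls above, with the getPageNumber() and getName() getters used in the examples, and with the Book{...} output printed further down might look like the following. The field names and the remaining getters are assumptions:

```java
// Assumed shape of the Book class; the original definition is not shown.
class Book {
    private final String id;
    private final String name;
    private final String author;
    private final int pageNumber;
    private final int publishedYear;

    Book(String id, String name, String author, int pageNumber, int publishedYear) {
        this.id = id;
        this.name = name;
        this.author = author;
        this.pageNumber = pageNumber;
        this.publishedYear = publishedYear;
    }

    public String getId() { return id; }
    public String getName() { return name; }
    public String getAuthor() { return author; }
    public int getPageNumber() { return pageNumber; }
    public int getPublishedYear() { return publishedYear; }

    // Mirrors the Book{...} format that appears in the printed results.
    @Override
    public String toString() {
        return "Book{id='" + id + "', name='" + name + "', author='" + author
                + "', pageNumber=" + pageNumber + ", publishedYear=" + publishedYear + "}";
    }
}
```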

Filter Collection with Stream.filter()

Let’s filter this collection of books. Any predicate will do, so let’s, for example, keep only the books that have more than 400 pages:

List<Book> results = books.stream()
                          .filter(b -> b.getPageNumber() > 400)
                          .collect(Collectors.toList());

This results in a list which contains:

[
Book{id='001', name='Our Mathematical Universe', author='Max Tegmark', pageNumber=432, publishedYear=2014}, 
Book{id='003', name='Sapiens', author='Yuval Noah Harari', pageNumber=443, publishedYear=2011}
]

When filtering, a really useful method to chain is map(), which lets you map objects to another value. For example, we can map each book to its name, and thus return only the names of the books that satisfy the predicate from the filter() call:

List<String> results = books.stream()
                            .filter(b -> b.getPageNumber() > 400)
                            .map(Book::getName)
                            .collect(Collectors.toList());

This results in a list of strings:

[Our Mathematical Universe, Sapiens]

Filter Collection on Multiple Predicates with Stream.filter()

Commonly, we’d like to filter collections by more than one criterion. This can be done by chaining multiple filter() calls or by using a short-circuit predicate, which checks for two conditions in a single filter() call.

// One filter() call with a short-circuit (&&) condition
List<Book> results = books.stream()
                          .filter(b -> b.getPageNumber() > 400 && b.getName().length() > 10)
                          .collect(Collectors.toList());

// Two chained filter() calls, one condition each
List<Book> results2 = books.stream()
                           .filter(b -> b.getPageNumber() > 400)
                           .filter(b -> b.getName().length() > 10)
                           .collect(Collectors.toList());

When using multiple criteria, the lambda expressions can get somewhat lengthy. At that point, extracting them into standalone predicates might offer more clarity. But which approach is faster?
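Before turning to performance, here is one way the extraction into standalone predicates might look. This is a sketch, not the only option: the predicate names are made up, the nested Book class is a trimmed stand-in for the one used in this guide, and the conditions are combined with Predicate.and(), which short-circuits like &&:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class NamedPredicates {

    // Trimmed stand-in for the guide's Book class (assumed shape).
    static class Book {
        private final String name;
        private final int pageNumber;

        Book(String name, int pageNumber) {
            this.name = name;
            this.pageNumber = pageNumber;
        }

        String getName() { return name; }
        int getPageNumber() { return pageNumber; }
    }

    // Each criterion gets a descriptive, reusable name (hypothetical names).
    static final Predicate<Book> IS_LONG = b -> b.getPageNumber() > 400;
    static final Predicate<Book> HAS_LONG_NAME = b -> b.getName().length() > 10;

    static List<String> longBooksWithLongNames(List<Book> books) {
        return books.stream()
                // and() only evaluates the second predicate if the first passes
                .filter(IS_LONG.and(HAS_LONG_NAME))
                .map(Book::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
                new Book("Our Mathematical Universe", 432),
                new Book("Life 3.0", 280),
                new Book("Sapiens", 443));

        System.out.println(longBooksWithLongNames(books));
        // prints [Our Mathematical Universe]
    }
}
```

Predicate also offers or() and negate(), so named predicates can be recombined without rewriting the conditions.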

Single Filter with Complex Condition or Multiple Filters?

It depends on your hardware, how large your collection is, and whether you use parallel streams or not. In general, one filter with a complex condition will outperform multiple filters with simpler conditions on small-to-medium collections, and perform at about the same level on very large collections. If your conditions are too long, you may benefit from distributing them over multiple filter() calls for improved readability, since performance is very similar.

The best choice is to try both, note the performance on the target device, and adjust your strategy accordingly.
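A very rough way to try both on your own machine is sketched below. It uses plain System.nanoTime() timing with a warm-up loop, which is crude and easily skewed by JIT compilation and garbage collection; a dedicated harness such as JMH gives far more trustworthy numbers. The class name, data set, conditions and run counts are arbitrary assumptions:

```java
import java.util.List;
import java.util.function.LongSupplier;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class FilterTiming {

    // Written to on every run so the JIT cannot discard the work.
    static volatile long sink;

    // One filter() with a combined condition.
    static long singleFilter(List<Integer> data) {
        return data.stream()
                .filter(n -> n % 2 == 0 && n % 3 == 0)
                .count();
    }

    // Two chained filter() calls with one condition each.
    static long chainedFilters(List<Integer> data) {
        return data.stream()
                .filter(n -> n % 2 == 0)
                .filter(n -> n % 3 == 0)
                .count();
    }

    // Crude average wall-clock time per run, after an equal number of warm-up runs.
    static long averageNanos(LongSupplier task, int runs) {
        for (int i = 0; i < runs; i++) {
            sink += task.getAsLong(); // warm-up
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink += task.getAsLong();
        }
        return (System.nanoTime() - start) / runs;
    }

    public static void main(String[] args) {
        List<Integer> data = IntStream.range(0, 100_000)
                .boxed()
                .collect(Collectors.toList());

        System.out.println("single filter:   "
                + averageNanos(() -> singleFilter(data), 50) + " ns/run");
        System.out.println("chained filters: "
                + averageNanos(() -> chainedFilters(data), 50) + " ns/run");
    }
}
```

Both methods return the same count, so any difference in the printed numbers comes from how the pipeline is structured, not from the result.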

GitHub user volkodavs ran a filtering benchmark measuring throughput in operations/s, and hosted the results in the “javafilters-benchmarks” repository. The results are summarized in an informative table.

It shows a clear diminishing of returns at larger collection sizes, with both approaches performing around the same level. Parallel streams benefit significantly at larger collection sizes, but curb the performance at smaller sizes (below ~10k elements). It’s worth noting that parallel streams maintained their throughput much better than non-parallel streams, making them significantly more robust to input.
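Switching a filter between sequential and parallel execution is a one-call change, which makes it easy to compare the two on your own data. The sketch below (class and method names are illustrative) assumes a stateless predicate. With an ordered source such as a List, collecting with Collectors.toList() preserves encounter order, so both variants produce the same list:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ParallelFilterDemo {

    // Runs the same stateless filter either sequentially or in parallel.
    static List<Integer> multiplesOfSix(List<Integer> data, boolean parallel) {
        Stream<Integer> stream = parallel ? data.parallelStream() : data.stream();
        return stream
                .filter(n -> n % 6 == 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> data = IntStream.range(0, 1_000)
                .boxed()
                .collect(Collectors.toList());

        List<Integer> sequential = multiplesOfSix(data, false);
        List<Integer> parallel = multiplesOfSix(data, true);

        System.out.println("Same result: " + sequential.equals(parallel));
    }
}
```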

Timestamp: October 7, 2022 / October 29, 2022