---
layout: post
title: "Have You Really Mastered Parallel Streams in Practice?"
subtitle: "When and How to Use Parallel Stream"
date: 2019-09-24
author: S.L
header-img: img/home-bg-o.jpg
catalog: true
tags:
  - Java
---
From the master himself, Brian Goetz, in Understanding Parallel Stream Performance in Java SE 8:
It is a bad idea to just drop `.parallel()` all over the place simply because you can.
A parallel execution will always involve more work than a sequential one, because in addition to solving the problem, it also has to perform dispatching and coordinating of sub-tasks.
Further, note that parallelism also often exposes nondeterminism in the computation that is often hidden by sequential implementations; sometimes this doesn’t matter, or can be mitigated by constraining the operations involved (i.e., reduction operators must be stateless and associative.)
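
To make the associativity requirement concrete, here is a minimal sketch (the class and method names are illustrative, not from the original text). It reduces the same range with an associative operator (addition) and a non-associative one (subtraction). Sequential and parallel runs agree for addition. For subtraction, the parallel result depends on how the range is split and recombined, so it is not specified.

```java
import java.util.stream.IntStream;

// Illustrative sketch: class and method names are hypothetical.
class ReductionDemo {

    // Addition is associative: any grouping of the operands gives the same total.
    static int sum(boolean parallel) {
        IntStream s = IntStream.rangeClosed(1, 1000);
        if (parallel) {
            s = s.parallel();
        }
        return s.reduce(0, Integer::sum);
    }

    // Subtraction is NOT associative: (a - b) - c differs from a - (b - c),
    // so the parallel result depends on how the range was split and combined.
    static int subtract(boolean parallel) {
        IntStream s = IntStream.rangeClosed(1, 1000);
        if (parallel) {
            s = s.parallel();
        }
        return s.reduce(0, (a, b) -> a - b);
    }

    public static void main(String[] args) {
        System.out.println("sum, sequential:      " + sum(false));
        System.out.println("sum, parallel:        " + sum(true));
        System.out.println("subtract, sequential: " + subtract(false));
        // The next value is unspecified and may differ between runs or machines.
        System.out.println("subtract, parallel:   " + subtract(true));
    }
}
```

The sequential version hides the problem because it always folds left to right. Only the parallel run exposes that the operator was never valid for `reduce`.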
Moreover, remember that parallel streams don’t magically solve all the synchronization problems. If a shared resource is used by the predicates and functions used in the process, you’ll have to make sure that everything is thread-safe. In particular, side effects are things you really have to worry about if you go parallel.
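
A small sketch of the side-effect problem, under the assumption that the goal is simply to gather results into a list (all names here are hypothetical). The first method mutates a shared `ArrayList` from a parallel `forEach`, which is a data race. The second lets the stream do the accumulation through `collect`, which is designed for parallel use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Illustrative sketch: class and method names are hypothetical.
class SideEffectDemo {

    // UNSAFE: ArrayList is not thread-safe, so concurrent add() calls may
    // lose elements or throw. Returns the observed size, or -1 on an exception.
    static int unsafeSize(int n) {
        List<Integer> shared = new ArrayList<>();
        try {
            IntStream.range(0, n).parallel().forEach(shared::add);
        } catch (RuntimeException e) {
            return -1;
        }
        return shared.size();
    }

    // SAFE: no shared mutable state; collect() merges per-thread partial
    // results and preserves encounter order for this ordered source.
    static List<Integer> squares(int n) {
        return IntStream.range(0, n)
                .parallel()
                .map(i -> i * i)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        int n = 10_000;
        System.out.println("unsafe size (expected " + n + "): " + unsafeSize(n));
        System.out.println("safe size:   " + squares(n).size());
    }
}
```

The unsafe variant may happen to print the right size on some runs, which is exactly why this kind of bug tends to survive testing.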
It is best to develop first using sequential execution and then apply parallelism where
(A) you know that there’s actually benefit to increased performance and
(B) that it will actually deliver increased performance.
(A) is a business problem, not a technical one. If you are a performance expert, you’ll usually be able to look at the code and determine (B), but the smart path is to measure. (And, don’t even bother until you’re convinced of (A); if the code is fast enough, better to apply your brain cycles elsewhere.)
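
As a rough illustration of "the smart path is to measure", the sketch below times the same pipeline sequentially and in parallel. The workload, the class name, and the input size are arbitrary assumptions. Wall-clock timing with `System.nanoTime()` is only a first approximation, and a benchmark harness such as JMH is the more reliable choice for real decisions.

```java
import java.util.stream.LongStream;

// Illustrative sketch: the workload and all names are hypothetical.
class MeasureDemo {

    // Written to a volatile field so the JIT cannot discard the computation.
    static volatile long sink;

    // An arbitrary CPU-bound pipeline, run either sequentially or in parallel.
    static long work(boolean parallel, long n) {
        LongStream s = LongStream.rangeClosed(1, n);
        if (parallel) {
            s = s.parallel();
        }
        return s.map(x -> (x * x) % 7).sum();
    }

    // Crude wall-clock timing of one run, in milliseconds.
    static long timeMillis(boolean parallel, long n) {
        long start = System.nanoTime();
        sink = work(parallel, n);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long n = 50_000_000L;
        // Several rounds, because the first ones include JIT warm-up.
        for (int round = 1; round <= 5; round++) {
            long sequential = timeMillis(false, n);
            long parallel = timeMillis(true, n);
            System.out.printf("round %d: sequential %d ms, parallel %d ms%n",
                    round, sequential, parallel);
        }
    }
}
```

Try shrinking `n` to a few thousand: the coordination overhead is then likely to dominate and the parallel version may well come out slower.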
I would use sequential streams by default and only consider parallel ones if:

- I have a massive amount of items to process (or the processing of each item takes time and is parallelizable)
- I have a performance problem in the first place
- I don’t already run the process in a multi-thread environment (for example: in a web container, if I already have many requests to process in parallel, adding an additional layer of parallelism inside each request could have more negative than positive effects)
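
One reason the last point matters: by default, parallel streams run on the JVM-wide common `ForkJoinPool` (plus the calling thread), so every request in the same process competes for the same few workers. The sketch below, with hypothetical names, prints the common pool's parallelism and the threads that actually executed a parallel stream.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

// Illustrative sketch: class and method names are hypothetical.
class CommonPoolDemo {

    // Records the name of every thread that processed at least one element.
    static Set<String> threadsUsed(int n) {
        Set<String> names = ConcurrentHashMap.newKeySet(); // thread-safe set
        IntStream.range(0, n)
                .parallel()
                .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: "
                + ForkJoinPool.getCommonPoolParallelism());
        // Typically the calling thread plus common-pool worker threads.
        threadsUsed(1_000_000).forEach(System.out::println);
    }
}
```

In a web container, each request thread that calls `.parallel()` draws on this same shared pool, so a slow or blocking task in one request can delay parallel streams in all the others.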