# Tips and tricks for Scala collections

2021-10-22 10:43:44

# Contents

1. Legend
2. Composition
3. Side effects
4. Sequence
   1. Creation
   2. Length
   3. Equality
   4. Indexing
   5. Existence
   6. Filtering
   7. Sorting
   8. Reduction
   9. Rewriting
5. Set
6. Option
   1. Value
   2. Null
   3. Rewriting
7. Map

Original text: Scala Collections Tips and Tricks, by Pavel Fatin, a JetBrains employee who develops the Scala plugin for IntelliJ IDEA.
Inspired by his work on Scala Collections inspections, he compiled this list of tips for the Scala Collections API.
Some of the tips rest on subtle implementation details, but most are common-sense transformations that are often overlooked in practice.
The tips are valuable: they can help you understand Scala collections and make your code faster and simpler.

## Legend

To make the following examples easier to understand, here are some conventions:

• `seq`: a `Seq`-based collection, such as `Seq(1, 2, 3)`
• `set`: a `Set` instance, such as `Set(1, 2, 3)`
• `array`: an array, such as `Array(1, 2, 3)`
• `option`: an `Option`, such as `Some(1)`
• `map`: a `Map`, such as `Map(1 -> "foo", 2 -> "bar")`
• `p`: a predicate function of type `T => Boolean`, such as `_ > 2`
• `n`: an integer
• `i`: an index
• `f, g`: simple functions, `A => B`
• `x, y`: arbitrary values
• `z`: an initial or default value

## Composition

Remember that although the tips are independent and self-contained, they can be combined and applied step by step to arrive at a simpler, higher-level expression, for example when checking whether an element exists in a `Seq`.

We can rely on the substitution model of recipe application (see SICP) to simplify complex expressions.
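
The step-by-step simplification described above can be sketched as follows. The object name `CompositionExample` and the sample values are illustrative assumptions; each step applies one tip from this list and keeps the result unchanged.

```scala
object CompositionExample {
  val seq = Seq(1, 2, 3)
  val x = 2

  // Starting point: a verbose existence check
  val step0 = seq.filter(_ == x).headOption != None
  // seq.filter(p).headOption  ->  seq.find(p)
  val step1 = seq.find(_ == x) != None
  // option != None  ->  option.isDefined
  val step2 = seq.find(_ == x).isDefined
  // seq.find(p).isDefined  ->  seq.exists(p)
  val step3 = seq.exists(_ == x)
  // seq.exists(_ == x)  ->  seq.contains(x)
  val step4 = seq.contains(x)
}
```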

## Side effects

Side effects are a fundamental concept in functional programming languages. For Scala there is a dedicated slide deck on the topic: Side-effect checking for Scala.
Basically, a side effect is any action, besides returning a value, that is observable outside the function or expression, such as one of the following:

• Input or output operations (such as file or network I/O)
• Modification of an external variable
• Mutation of the state of an external object
• Throwing an exception

When a function or expression does any of the above, we say it has side effects; otherwise, we say it is pure.

Why do side effects matter? Because when they are present, the order of evaluation cannot be changed freely. Take, for example, two pure expressions, one computing `x` and one computing `y`.

Because they have no side effects, the two expressions can be swapped: evaluating `x` and then `y` has the same effect as evaluating `y` and then `x`.
Now suppose each expression also has a side effect (console output).

Then the two expressions cannot be swapped, because swapping them changes the order of the output.
So side effects reduce the number of possible transformations, including possible simplifications and optimizations.
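
The difference can be sketched as follows. This is an illustration under stated assumptions: the object name, the helper `run`, and the strings `"foo"` and `"bar"` are made up for the demonstration, and a shared log stands in for console output so that the effect can be inspected.

```scala
import scala.collection.mutable.ListBuffer

object EvaluationOrderExample {
  // Pure expressions: swapping these two definitions changes nothing observable
  val x = 1 + 2
  val y = 2 + 3

  // Impure variant: each computation also appends to a shared log
  // (a stand-in for console output)
  def run(xFirst: Boolean): (Int, Int, List[String]) = {
    val log = ListBuffer.empty[String]
    def computeX(): Int = { log += "foo"; 1 + 2 }
    def computeY(): Int = { log += "bar"; 2 + 3 }
    if (xFirst) {
      val a = computeX(); val b = computeY()
      (a, b, log.toList)
    } else {
      val b = computeY(); val a = computeX()
      (a, b, log.toList)
    }
  }
}
```

The returned values are the same in both orders, but the log differs, which is exactly why the two impure expressions cannot be swapped freely.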
The same reasoning applies to collection-related expressions. Consider a predicate that, as a side effect, appends each element it sees to an external `builder` variable (via the `append` method).

In principle, `seq.filter(p).headOption` can be simplified to `seq.find(p)`, but the side effect prevents us from doing so. Suppose we try anyway.

The result differs from that of the original expression. After the original expression is evaluated, every element has been appended to `builder`, whereas the rewritten expression stops appending as soon as it finds the first element greater than 3.
The two expressions are not equivalent.
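
A minimal sketch of this situation, assuming a strict `Seq` (a `List`) and a `StringBuilder` in the role of `builder`; the object name and sample values are illustrative.

```scala
object SideEffectExample {
  val seq = Seq(1, 2, 3, 4, 5, 6)

  // filter visits every element, so every element reaches the builder
  val builder1 = new StringBuilder
  val viaFilter = seq.filter { x => builder1.append(x); x > 3 }.headOption

  // find stops at the first match, so later elements never reach the builder
  val builder2 = new StringBuilder
  val viaFind = seq.find { x => builder2.append(x); x > 3 }
}
```

Both expressions return the same value, yet they leave the builders in different states.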

Is automatic simplification still possible? Here are two golden rules for code with side effects:

1. Avoid side effects whenever possible
2. Otherwise, isolate the side effects from the pure code

For the example above, we need to either get rid of `builder` or isolate it from the pure code. Assuming `builder` is a third-party object that we cannot get rid of, we isolate it.

Then the tips in this article can be applied to the pure part.

Well done! Automatic simplification is now possible. An added benefit is that, thanks to the clear separation, the code is easier to understand.
A less obvious benefit is that the code becomes more robust. In the example above, the outcome of the side effect depends on the `Seq` implementation, for example `Vector` versus `Stream`. Isolating the side effect lets us avoid this unpredictable behavior.
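
The isolation step might look like the following sketch, again assuming a `StringBuilder` as the external `builder`; the names are illustrative.

```scala
object IsolationExample {
  val seq = Seq(1, 2, 3, 4, 5, 6)
  val builder = new StringBuilder

  // The side effect lives on its own line
  seq.foreach(x => builder.append(x))
  // The pure part can now be simplified safely:
  // seq.filter(_ > 3).headOption  ->  seq.find(_ > 3)
  val firstAbove3 = seq.find(_ > 3)
}
```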

## Sequence

The tips in this section are for `Seq` and its subclasses. Some of the transformations also apply to other collection classes, such as `Set`, `Option`, `Map`, and even `Iterator`, because they provide similar interfaces.

### Creation

#### Create empty collections explicitly

This can sometimes save memory (by reusing the empty instance) and CPU (by avoiding needless length checks).

This also applies to `Set`, `Option`, `Map`, `Iterator`.
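
A small sketch of the tip, with an illustrative object name and `Int` chosen arbitrarily as the element type:

```scala
object EmptyCreationExample {
  // Before: goes through the general-purpose factory with no arguments
  val before: Seq[Int] = Seq[Int]()
  // After: states the intent directly
  val after: Seq[Int] = Seq.empty[Int]
}
```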

### Length

#### For arrays, prefer `length` to `size`

`length` and `size` are basically synonyms. In Scala 2.11, however, `Array.size` is implemented through an implicit conversion, so every call creates an intermediate wrapper object unless the JVM's escape analysis eliminates it. This produces unnecessary garbage and can hurt performance.
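
For illustration only (the object name and values are assumptions), both calls return the same number; the difference lies in how they are resolved:

```scala
object ArrayLengthExample {
  val array = Array(1, 2, 3)
  // Before: resolved through an implicit conversion to a wrapper
  val viaSize = array.size
  // After: reads the array length directly
  val viaLength = array.length
}
```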

#### Don't negate emptiness-related properties

The same applies to `Set`, `Option`, `Map`, `Iterator`.
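
A sketch of the tip, using an illustrative object name and sample sequence:

```scala
object EmptinessNegationExample {
  val seq = Seq(1, 2, 3)
  // Before: a property negated to obtain its opposite
  val before1 = !seq.isEmpty
  val before2 = !seq.nonEmpty
  // After: the opposite property is asked for directly
  val after1 = seq.nonEmpty
  val after2 = seq.isEmpty
}
```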

#### Don't compute `length` to check for emptiness

On the one hand, dedicated emptiness methods already exist. On the other hand, for `LinearSeq` and subclasses such as `List`, computing `length` takes `O(n)` time (for `IndexedSeq` it takes `O(1)`).

The same applies to `Set`, `Map`.
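
As a sketch (names and values are illustrative), the length-based checks and their direct replacements give the same answers:

```scala
object LengthEmptinessExample {
  val seq = List(1, 2, 3)
  // Before: computes the whole length just to compare it with zero
  val before1 = seq.length > 0
  val before2 = seq.length != 0
  val before3 = seq.length == 0
  // After: constant-time emptiness checks
  val after1 = seq.nonEmpty
  val after2 = seq.nonEmpty
  val after3 = seq.isEmpty
}
```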

#### Don't compute the full `length` for length comparisons

As above, computing `length` can be expensive. Comparing against `n` without computing the full length (for example with `lengthCompare`) can reduce the cost from `O(length)` to `O(length min n)`.
For infinite streams, this technique is absolutely necessary.
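
A sketch using `lengthCompare`, which is part of the standard `Seq` API. The object name and values are illustrative, and the infinite sequence uses `LazyList`, which assumes Scala 2.13 or later (older versions would use `Stream`).

```scala
object LengthCompareExample {
  val seq = List(1, 2, 3, 4, 5)
  val n = 3

  // Before: walks the whole list to compute its length
  val before = seq.length > n
  // After: stops as soon as the answer is known
  val after = seq.lengthCompare(n) > 0

  // On an infinite lazy sequence, length would never return,
  // while lengthCompare still terminates
  val infinite = LazyList.from(1)
  val infiniteIsLonger = infinite.lengthCompare(n) > 0
}
```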

### Equality

#### Don't use `==` to compare arrays

Because `==` on arrays compares only instance identity, not the elements inside.

The same applies to `Iterator`.
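
A short sketch of the pitfall and the fix, with illustrative names and values:

```scala
object ArrayEqualityExample {
  val array1 = Array(1, 2, 3)
  val array2 = Array(1, 2, 3)
  // Before: compares references, so two distinct arrays are not equal
  val before = array1 == array2
  // After: compares the elements pairwise
  val after = array1.sameElements(array2)
}
```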

#### Do not check equality manually

Use the built-in methods instead.
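
One way this can show up in practice is an element-by-element comparison written with `corresponds`. The sketch below is an illustration with assumed names and values, not necessarily the only manual pattern the tip has in mind.

```scala
object ManualEqualityExample {
  val seq1 = Seq(1, 2, 3)
  val seq2 = Seq(1, 2, 3)
  // Before: equality spelled out element by element
  val before = seq1.corresponds(seq2)(_ == _)
  // After: built-in structural equality of sequences
  val after = seq1 == seq2
}
```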

### Existence

#### Don't use an equality predicate to check for element presence

The same applies to `Set`, `Option`, `Iterator`.
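
Sketched with illustrative names and values:

```scala
object ContainsExample {
  val seq = Seq(1, 2, 3)
  val x = 2
  // Before: a predicate that only tests equality
  val before = seq.exists(_ == x)
  // After: the dedicated membership check
  val after = seq.contains(x)
}
```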

#### Don't use an inequality predicate to check for element absence

The same applies to `Set`, `Option`, `Iterator`
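
The mirror image of the previous tip, again with illustrative names and values:

```scala
object AbsenceExample {
  val seq = Seq(1, 2, 3)
  val x = 5
  // Before: every element is tested for inequality
  val before = seq.forall(_ != x)
  // After: a negated membership check
  val after = !seq.contains(x)
}
```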

#### Don't count occurrences to check for existence

The same applies to `Set`, `Map`, `Iterator`
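
A sketch with an assumed predicate `p` and sample values. Note that `count` always traverses the whole collection, while `exists` can stop at the first match.

```scala
object CountExistenceExample {
  val seq = Seq(1, 2, 3, 4)
  val p: Int => Boolean = _ > 2
  // Before: counts every match just to compare the total with zero
  val before1 = seq.count(p) > 0
  val before2 = seq.count(p) == 0
  // After: stops at the first match
  val after1 = seq.exists(p)
  val after2 = !seq.exists(p)
}
```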

#### Don't use `filter` to check for existence

The same applies to `Set`, `Option`, `Map`, `Iterator`
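
As an illustration (names and values assumed):

```scala
object FilterExistenceExample {
  val seq = Seq(1, 2, 3, 4)
  val p: Int => Boolean = _ > 2
  // Before: builds a filtered collection only to test whether it is empty
  val before = seq.filter(p).nonEmpty
  // After: no temporary collection, and traversal stops at the first match
  val after = seq.exists(p)
}
```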

### Filtering

#### Don't negate the filter predicate

The same applies to `Set`, `Option`, `Map`, `Iterator`
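
A sketch with an assumed predicate and sample values:

```scala
object FilterNotExample {
  val seq = Seq(1, 2, 3, 4)
  val p: Int => Boolean = _ > 2
  // Before: the predicate is negated by hand
  val before = seq.filter(!p(_))
  // After: filterNot expresses the intent directly
  val after = seq.filterNot(p)
}
```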

#### Don't use `filter` to count elements

Calling `filter` creates a temporary collection, which affects GC and performance.

The same applies to `Set`, `Option`, `Map`, `Iterator`
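
For example (names and values are illustrative):

```scala
object CountExample {
  val seq = Seq(1, 2, 3, 4)
  val p: Int => Boolean = _ > 2
  // Before: allocates a filtered collection just to measure it
  val before = seq.filter(p).length
  // After: counts matches in a single pass without a temporary collection
  val after = seq.count(p)
}
```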

#### Don't use `filter` to find the first occurrence

The same applies to `Set`, `Option`, `Map`, `Iterator`
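
Sketched with a pure predicate (an assumption that matters, as the section on side effects explains):

```scala
object FindExample {
  val seq = Seq(1, 2, 3, 4)
  val p: Int => Boolean = _ > 2
  // Before: filters everything, then keeps only the first result
  val before = seq.filter(p).headOption
  // After: stops at the first match
  val after = seq.find(p)
}
```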

### Reduction

#### Don't compute the sum manually

The same goes for manual sums written with `reduceLeft`, `reduceRight`, `foldLeft`, `foldRight`.

The same applies to `Set`, `Iterator`
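
A sketch with illustrative values. One difference worth keeping in mind: `reduce` throws on an empty collection, whereas `sum` returns zero.

```scala
object SumExample {
  val seq = Seq(1, 2, 3, 4)
  // Before: addition spelled out by hand
  val before1 = seq.reduce(_ + _)
  val before2 = seq.fold(0)(_ + _)
  // After: the built-in method
  val after = seq.sum
}
```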

#### Don't compute the product manually

The same applies to `Set`, `Iterator`
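
The same idea for multiplication, with illustrative values:

```scala
object ProductExample {
  val seq = Seq(1, 2, 3, 4)
  // Before: multiplication spelled out by hand
  val before1 = seq.reduce(_ * _)
  val before2 = seq.fold(1)(_ * _)
  // After: the built-in method
  val after = seq.product
}
```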

#### Don't emulate `forall`

The same applies to `Set`, `Option` (for the second line), `Iterator`
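
Two hand-written emulations and their replacement, sketched with an assumed predicate and values:

```scala
object ForallExample {
  val seq = Seq(3, 4, 5)
  val p: Int => Boolean = _ > 2
  // Before: a fold that ands the predicate over all elements
  val before1 = seq.foldLeft(true)((acc, y) => acc && p(y))
  // Before: maps to booleans and looks for a false
  val before2 = !seq.map(p).contains(false)
  // After: the built-in method, which also stops at the first failure
  val after = seq.forall(p)
}
```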

### Rewriting

#### Merge consecutive `filter` calls

Alternatively, use a view: `seq.view.filter(p1).filter(p2).force`

The same applies to `Set`, `Option`, `Map`, `Iterator`
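
A sketch with two assumed predicates, `p1` and `p2`:

```scala
object MergeFilterExample {
  val seq = Seq(1, 2, 3, 4, 5, 6)
  val p1: Int => Boolean = _ > 2
  val p2: Int => Boolean = _ % 2 == 0
  // Before: the first filter builds a temporary collection for the second
  val before = seq.filter(p1).filter(p2)
  // After: one pass with a combined predicate
  val after = seq.filter(x => p1(x) && p2(x))
}
```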

#### Merge consecutive `map` calls

Alternatively, use a view: `seq.view.map(f).map(g).force`

The same applies to `Set`, `Option`, `Map`, `Iterator`
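
A sketch with two assumed functions, `f` and `g`, composed with `andThen`:

```scala
object MergeMapExample {
  val seq = Seq(1, 2, 3)
  val f: Int => Int = _ + 1
  val g: Int => Int = _ * 2
  // Before: the first map builds a temporary collection for the second
  val before = seq.map(f).map(g)
  // After: one pass with the composed function
  val after = seq.map(f.andThen(g))
}
```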

#### Don't emulate `slice`

The same applies to `Set`, `Map`, `Iterator`
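
For illustration, with assumed offsets `x` and `y`:

```scala
object SliceExample {
  val seq = Seq(1, 2, 3, 4, 5, 6)
  val x = 2
  val y = 3
  // Before: drop followed by take
  val before = seq.drop(x).take(y)
  // After: slice takes a start index and an exclusive end index
  val after = seq.slice(x, x + y)
}
```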

#### Don't emulate `flatten`

The same applies to `Set`, `Map`, `Iterator`
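
A sketch with an illustrative nested sequence:

```scala
object FlattenExample {
  val seq = Seq(Seq(1, 2), Seq(3), Seq(4, 5))
  // Before: flatMap with the identity function
  val before = seq.flatMap(identity)
  // After: the dedicated method
  val after = seq.flatten
}
```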

#### Don't emulate `flatMap`

The same applies to `Set`, `Option`, `Iterator`
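
A sketch with an assumed function `f` that returns a sequence:

```scala
object FlatMapExample {
  val seq = Seq(1, 2, 3)
  val f: Int => Seq[Int] = x => Seq(x, x * 10)
  // Before: map builds a nested collection that is flattened afterwards
  val before = seq.map(f).flatten
  // After: one step, no nested temporary collection
  val after = seq.flatMap(f)
}
```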

#### Don't use `map` when the result is not needed

The same applies to `Set`, `Option`, `Map`, `Iterator`
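
In the sketch below, two `ListBuffer`s stand in for whatever effect the loop body performs; the names are illustrative.

```scala
import scala.collection.mutable.ListBuffer

object ForeachExample {
  val seq = Seq(1, 2, 3)
  val viaMap = ListBuffer.empty[Int]
  val viaForeach = ListBuffer.empty[Int]

  // Before: map builds a result collection that is thrown away
  seq.map(x => viaMap += x)
  // After: foreach runs the effect without building a result
  seq.foreach(x => viaForeach += x)
}
```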

#### Don't produce temporary collections

1. Use a view

2. Convert the view back to a collection of the same type

If filtering is the only intermediate transformation, a lighter alternative such as `withFilter` may be enough.

3. Convert the view to another collection type

There is also a combined "transformation + conversion" approach.
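
The view-based variants can be sketched as follows. The functions `f`, `g`, `p` and the values are assumptions made for the example. The sketch targets Scala 2.13 or later, where `force` on views is deprecated, so it converts with `toList` instead.

```scala
object ViewExample {
  val seq = Seq(1, 2, 3, 4)
  val f: Int => Int = _ + 1
  val g: Int => Seq[Int] = x => Seq(x, x * 10)
  val p: Int => Boolean = _ % 2 == 0

  // Before: every step builds an intermediate collection
  val before = seq.map(f).flatMap(g).filter(p).reduce(_ + _)
  // After: the view applies the steps lazily, with no intermediate collections
  val after = seq.view.map(f).flatMap(g).filter(p).reduce(_ + _)

  // Converting a view to a concrete collection
  val asList: List[Int] = seq.view.map(f).filter(p).toList

  // A filter followed by one transformation: withFilter avoids the temporary
  val filteredThenMapped = seq.withFilter(p).map(f)
}
```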

#### Use assignment operators

Scala has syntactic sugar that automatically converts `x <op>= y` to `x = x <op> y`. If `op` ends with `:`, it is treated as a right-associative operator.
Lists and streams have operators of their own that work the same way (for example `::=` and `#::=`).

The same applies to `Set`, `Map`, `Iterator`
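
A sketch of the operators on a `var`; the object name and values are illustrative.

```scala
object AssignmentOperatorExample {
  var seq = Seq(2)
  seq :+= 3        // same as seq = seq :+ 3
  seq +:= 1        // right-associative: same as seq = 1 +: seq
  seq ++= Seq(4)   // same as seq = seq ++ Seq(4)

  var list = List(2)
  list ::= 1       // same as list = 1 :: list
}
```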

## Set

Most of the `Seq` tips also apply to `Set`. In addition, there are tips that are specific to `Set`.

### Don't use `sameElements` to compare unordered collections

The same applies to `Map`
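
A sketch of why order-sensitive comparison is unreliable for sets. As an assumption about the reader's Scala version, the comparison goes through `iterator`, because `sameElements` may not be available directly on `Set` in newer versions; the names and values are illustrative.

```scala
object SetEqualityExample {
  val set1 = Set(1, 2, 3)
  val set2 = Set(3, 2, 1)
  // Before: an order-sensitive comparison; the iteration order of a set is an
  // implementation detail, so the result cannot be relied upon
  val before = set1.iterator.sameElements(set2.iterator)
  // After: proper set equality, independent of iteration order
  val after = set1 == set2
}
```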

## Option

`Option` is not a collection class, but it provides similar methods and behavior.
Most of the `Seq` tips also apply to `Option`. Here are some tips that are specific to `Option`.

### Value

#### Don't use pattern matching to check for value presence

The same applies to `Seq`, `Set`.
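
For example (the object name and sample value are illustrative):

```scala
object OptionPresenceExample {
  val option: Option[Int] = Some(1)
  // Before: a pattern match that only produces a Boolean
  val before = option match {
    case Some(_) => true
    case None    => false
  }
  // After: the dedicated method
  val after = option.isDefined
}
```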

## Map

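As above, most of the `Seq` tips also apply to `Map`; the tips below are specific to maps.

### Don't use `lift` instead of `get`

There is no particular need to treat the map as a partial function in order to obtain an optional value; `get` returns an `Option` directly.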
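
A sketch with an illustrative map and key:

```scala
object LiftVsGetExample {
  val map = Map(1 -> "foo", 2 -> "bar")
  val n = 1
  // Before: lifts the map (viewed as a partial function) into a total function
  val before = map.lift(n)
  // After: get already returns an Option
  val after = map.get(n)
}
```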

### Use `filterKeys` with care

Because `filterKeys` wraps the original collection without copying any elements, subsequent processing must be done with care.
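
The wrapping behavior can be observed with a predicate that counts its own invocations. This is a sketch under stated assumptions: it uses the Scala 2.13+ spelling `map.view.filterKeys(p)` (older versions call `map.filterKeys(p)` directly), and the counter `calls` is a hypothetical helper added for the demonstration.

```scala
object FilterKeysExample {
  val map = Map(1 -> "foo", 2 -> "bar", 3 -> "baz")
  var calls = 0
  val p: Int => Boolean = k => { calls += 1; k > 1 }

  // A lazy wrapper over the original map: nothing is copied
  val filtered = map.view.filterKeys(p)

  // Each traversal consults the predicate again
  val first = filtered.toMap
  val callsAfterFirst = calls
  val second = filtered.toMap
  val callsAfterSecond = calls
}
```

If the filtered result is used more than once, or the predicate is expensive, converting it to a strict map once (as `first` does here) avoids the repeated work.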

Ditto.

### Use assignment operators to reassign a map
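
A sketch of the operators on an immutable map held in a `var`; the object name, keys, and values are illustrative.

```scala
object MapAssignmentExample {
  var map = Map(1 -> "foo")
  map += 2 -> "bar"          // same as map = map + (2 -> "bar")
  map ++= Map(3 -> "baz")    // same as map = map ++ Map(3 -> "baz")
  map -= 1                   // same as map = map - 1
}
```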

Beyond the tips above, I recommend reading the official Scala Collections documentation.

The original article ends with a modest closing note from the author, who welcomes comments and suggestions.

https://chowdera.com/2021/10/20211009000611613w.html