I realized the other day that, while I’ve been using Object#tap, Enumerator#with_object (via #each.with_object), and Enumerable#each_with_object for some time now, I wasn’t completely clear on the difference between these methods or how best to employ them. Mostly I just threw in Object#tap wherever I saw sandwich code and called it a day.
Time for a…
A good place to start is Ruby Docs for the method definitions:
Object#tap -
Yields self to the block, and then returns self. The primary purpose of this method is to “tap into” a method chain, in order to perform operations on intermediate results within the chain.
Enumerable#each_with_object -
Iterates the given block for each element with an arbitrary object given, and returns the initially given object.
If no block is given, returns an enumerator.
Enumerable#inject -
Combines all elements of enum by applying a binary operation, specified by a block or a symbol that names a method or operator.
If you specify a block, then for each element in enum the block is passed an accumulator value (memo) and the element. If you specify a symbol instead, then each element in the collection will be passed to the named method of memo. In either case, the result becomes the new value for memo. At the end of the iteration, the final value of memo is the return value for the method.
If you do not explicitly specify an initial value for memo, then the first element of collection is used as the initial value of memo.
Object#tap
I’ll start with this one, mostly because it’s the one that I have thrown around the most with the least amount of thought. So far I’ve mostly just used it for refactors like this: (cue extremely contrived example…)
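A minimal sketch of this kind of refactor, assuming a hypothetical Human class with name and age attributes (the class, the method names, and the values are made up for illustration):

```ruby
# Hypothetical Human class, assumed purely for illustration.
class Human
  attr_accessor :name, :age
end

# Before: "sandwich code" that creates the object, configures it,
# and then has to name it again on the last line just to return it.
def build_human_verbose
  human = Human.new
  human.name = "Sam"
  human.age = 30
  human
end

# After: tap yields the new Human to the block and returns that same Human,
# so the explicit return line disappears.
def build_human
  Human.new.tap do |human|
    human.name = "Sam"
    human.age = 30
  end
end
```

Both methods hand back an equivalent Human; the tap version just never has to mention the object a third time.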
On a personal note, I’m feeling conflicted about this method lately. I like that it eliminates the need to explicitly return the ‘Human’ object. I like it aesthetically, it reads cleanly to my eye, but sometimes it feels a bit like using it just to use it. Maybe even a bit trivial. Not everyone instantly recognizes it and it actually seems to obfuscate your intended return value a bit. Anyway…
While the above implementation certainly works, the documentation suggests it isn’t quite the use for which the method was designed.
"The primary purpose of this method is to “tap into” a method chain."
It may clean up the code, but maybe the #tap method is better reserved for cases where returning the original object provides a bit more function:
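A sketch of what such a chain might look like, assuming a hypothetical Human class whose #can_drive? checks for an age of at least 16 (the threshold and the name are assumptions):

```ruby
# Hypothetical Human class and driving-age rule, assumed for illustration.
class Human
  attr_accessor :name, :age

  def can_drive?
    age >= 16
  end
end

# The last expression in the block (the age assignment) evaluates to an Integer,
# but tap discards the block's result and returns the Human itself,
# so can_drive? can be chained directly onto the block.
can_drive = Human.new.tap do |person|
  person.name = "Sam"
  person.age = rand(1..100)
end.can_drive?
```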
Here the return value of person.age = rand(1..100) is an integer, and #can_drive? is a method on an instance of the class Human. Without #tap you would not be able to chain these methods together. This feels like a better use case to me; the #tap method serves a more significant purpose, allowing you to assign a variable name, an age, and call an instance method on the new object all in one shot.
Enumerable#each_with_object
So this one is a bit different but kind of in the same vein. It’s in the Enumerable module so we know that #each_with_object is called on some kind of collection, like an array or a hash, and iterates through the items in the collection, just like the plain-old #each method. What puts #each_with_object in the family of #tap is the ...with_object bit. It allows us to pass in an arbitrary object and return that object from the method. Where #tap is called on the object you wish to return, #each_with_object takes the return object as an argument. For me the importance is the concept. It couples the iteration and the return.
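A small side-by-side sketch of the two approaches, using a made-up list of names and building a hash of name lengths:

```ruby
# Hypothetical data, assumed for illustration.
names = %w[ann bob cy]

# With tap: the block receives the hash, and the iteration happens inside it.
lengths_with_tap = {}.tap do |lengths|
  names.each { |name| lengths[name] = name.length }
end

# With each_with_object: the hash is passed in as an argument, handed to every
# iteration as the second block parameter, and returned at the end.
lengths_with_object = names.each_with_object({}) do |name, lengths|
  lengths[name] = name.length
end
```

Both expressions build the same hash, { "ann" => 3, "bob" => 3, "cy" => 2 }.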
So, rather than iterating through the collection inside #tap’s block, we iterate through the collection implicitly. I’ve seen #tap used this way, but to my eye #each_with_object more clearly and succinctly communicates what you seek to accomplish. If performance is critical, go ahead and use #tap this way; otherwise #each_with_object might help out those developers coming behind you.
Alternatively, chaining #each.with_object (that is, Enumerator#with_object) produces the same result as #each_with_object and might even be more clear in its link with iterating over a collection.
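To illustrate the equivalence, here is a short sketch with a made-up array of words:

```ruby
# Hypothetical data, assumed for illustration.
words = %w[apple banana cherry]

# each without a block returns an Enumerator; with_object then yields each
# element plus the memo object and returns the memo object at the end.
chained = words.each.with_object([]) { |word, memo| memo << word.upcase }

# The single-method form takes the same block and returns the same thing.
single = words.each_with_object([]) { |word, memo| memo << word.upcase }
```

Both variables end up holding ["APPLE", "BANANA", "CHERRY"].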
Enumerable#inject aka Enumerable#reduce
This is the odd-ball of the group. The idea is that you have an object (noted as the memo in the documentation) to which some sort of changes are applied based on the objects over which you are iterating. The simplest example is addition:
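For instance, a minimal sketch that sums a made-up array:

```ruby
total = [1, 2, 3, 4, 5].inject(:+)
# total is 15, the same as ((((1 + 2) + 3) + 4) + 5)
```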
#inject comes in two flavors, taking a symbol or a block, and each of those flavors can also optionally take an initial value as an argument. If no initial value is passed, the first object in the collection becomes the initial value by default.
Passed a symbol:
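A sketch of the symbol flavor, with and without an initial value (the numbers are arbitrary):

```ruby
numbers = [1, 2, 3, 4]

# Symbol only: the first element (1) becomes the initial memo.
sum     = numbers.inject(:+)   # 1 + 2 + 3 + 4 => 10
product = numbers.inject(:*)   # 1 * 2 * 3 * 4 => 24

# Initial value plus symbol: the memo starts at the given value instead.
sum_from_ten     = numbers.inject(10, :+)   # 10 + 1 + 2 + 3 + 4 => 20
product_from_two = numbers.inject(2, :*)    # 2 * 1 * 2 * 3 * 4 => 48
```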
Passed a block:
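A sketch of the block flavor, using a block that doubles each number before adding it so the effect of the initial value is visible (the data and the doubling are assumptions for illustration):

```ruby
numbers = [1, 2, 3]

# With an initial value, the block runs for every element:
# 0 + 2, then 2 + 4, then 6 + 6.
with_initial = numbers.inject(0) { |memo, number| memo + number * 2 }   # => 12

# Without one, the first element (1) becomes the memo untouched and the
# block only runs for 2 and 3: 1 + 4, then 5 + 6.
without_initial = numbers.inject { |memo, number| memo + number * 2 }   # => 11
```

The two results differ by exactly the doubling that never happened to the first element.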
So I accidentally came up with a better example than I thought for passing a block without an initial value since it demonstrates that nothing is done to the first object in the collection. It is passed straight to the block execution for the next object in the collection as the memo object.
There’s one more layer to #inject. The return value of the method execution is the return value of the last execution of the block, not the memo object.
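A sketch of the pitfall, assuming a made-up array whose last element is 5:

```ruby
result = [1, 2, 3, 4, 5].inject(0) do |sum, number|
  if number != 5
    sum += 5
  end
end
# result is nil: on the last element the condition is false, so the
# if expression (and therefore the block) evaluates to nil.
```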
Since if number != 5; sum += 5; end returns nil when number == 5, the last object in the collection, the return value of the entire method is nil. You could do something kinda ugly like this:
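One way this workaround might look, again with a made-up array, is to end the block with the memo itself:

```ruby
result = [1, 2, 3, 4, 5].inject(0) do |sum, number|
  if number != 5
    sum += 5
  end
  sum # hand the memo back explicitly, whether or not it was changed
end
# result is 20: four elements each added 5, and the final 5 left sum alone.
```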
but at that point maybe you’re better off just using a different method like:
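As one possibility, assuming the "different method" is a plain #each with a local accumulator (the method name padded_total is hypothetical):

```ruby
# Hypothetical helper, assumed for illustration.
def padded_total(numbers)
  sum = 0
  numbers.each do |number|
    sum += 5 if number != 5
  end
  sum # the value we want back is stated on its own line
end

total = padded_total([1, 2, 3, 4, 5])
# total is 20
```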
At least it’s a bit more obvious at first glance what you want back.
The Inspiration
Here’s the cool bit that got me thinking about these methods, specifically #inject. I ran into a problem where I had a hash and I needed to return a hash with mutations applied to both the keys and values in the original hash. Changing values is easy. Changing keys takes a bit more work.
You could update the keys in place by iterating through the keys, creating new mutated key-value pairs, and deleting the old pairs:
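A sketch of the in-place approach, assuming a hypothetical toys hash with string keys and an integer markup (both values are made up):

```ruby
# Hypothetical data, assumed for illustration.
toys = { "robot" => 10, "ball" => 2 }
markup = 2

# Iterate over a snapshot of the keys (toys.keys is a new Array), since adding
# keys to a hash while iterating over the hash itself raises an error.
toys.keys.each do |name|
  price = toys.delete(name)                   # remove the old pair, keep its value
  toys[name.upcase.to_sym] = price * markup   # add the mutated pair
end
# toys is now { ROBOT: 20, BALL: 4 }
```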
You could use the Enumerable#map method to pass arrays of key value pairs to the Hash literal:
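A sketch of this approach using Hash[], with the same hypothetical toys data and markup assumed for illustration:

```ruby
# Hypothetical data, assumed for illustration.
toys = { "robot" => 10, "ball" => 2 }
markup = 2

# map returns an array of [key, value] pairs; Hash[] turns those pairs
# into a new hash and leaves toys untouched.
newly_priced = Hash[toys.map { |name, price| [name.upcase.to_sym, price * markup] }]
# newly_priced is { ROBOT: 20, BALL: 4 }
```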
Or you could use the inject method to pass around a hash and merge new key-value pairs into it.
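A sketch of the #inject version, wrapped in a hypothetical reprice method so the memo can keep the name newly_priced (the method name and sample data are assumptions):

```ruby
# Hypothetical helper, assumed for illustration.
def reprice(toys, markup)
  # The memo starts as an empty hash; (name, price) destructures each pair.
  toys.inject({}) do |newly_priced, (name, price)|
    # merge returns a new hash, which becomes the memo for the next iteration.
    newly_priced.merge(name.upcase.to_sym => price * markup)
  end
end

newly_priced = reprice({ "robot" => 10, "ball" => 2 }, 2)
# newly_priced is { ROBOT: 20, BALL: 4 }
```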
I’d be wary of mutating the original hash by mutating the keys, assigning the new keys values, and deleting the old keys. If I were to use simple iteration to solve this problem I would probably instantiate a new hash and assign that hash key-value pairs from within the each block. Nothing fancy, but it works.
Using the Hash literal and #map looks a bit odd to me. I haven’t seen anyone build hashes this way. The #map inside of the Hash literal is confusing unless you recall that #map always returns an array and that passing arrays of two objects to the Hash literal creates key-value pairs. This solution feels like it is asking a bit more of the next developer to come along than the other options.
Now #inject… I like this one. I like it despite the fact that IT ASKS A LOT!!! of the developers to follow and many things happen in very few lines, buuuuut it’s a pretty slick piece of code. And sometimes I am a sucker for a slick piece of code. So, what does it assume? Well, you have to know:
1. What #inject is.
2. That with #inject you have a memo object to pass to each execution of the block.
3. That the key-value pairs in toys are passed into the block as an array.
4. That you can use parentheses and multiple assignment to make the block take only two arguments (the memo and the array containing the key and value) and instantly set the value of name and price.
5. That the return value of the #inject method is the return value of the last execution of the block, so you have to return the hash object from the block.
6. And finally, that you can add key-value pairs to a hash using the Hash#merge method, essentially taking the key-value assignment as a mini-hash and merging it into the existing hash. Maybe this would be a bit more clear: newly_priced.merge( {name.upcase.to_sym => price * markup} )
Performance
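A sketch of how such a comparison might be set up with Ruby's standard Benchmark module. The data, the iteration count, and the labels are assumptions for illustration, and timings will vary by Ruby version and machine:

```ruby
require "benchmark"

# Hypothetical data and iteration count, assumed for illustration.
TOYS = { "robot" => 10, "ball" => 2, "kite" => 7 }.freeze
MARKUP = 2
ITERATIONS = 10_000

# Each lambda builds the same repriced hash in a different way.
APPROACHES = {
  "each" => lambda {
    result = {}
    TOYS.each { |name, price| result[name.upcase.to_sym] = price * MARKUP }
    result
  },
  "inject" => lambda {
    TOYS.inject({}) { |memo, (name, price)| memo.merge(name.upcase.to_sym => price * MARKUP) }
  },
  "map/Hash[]" => lambda {
    Hash[TOYS.map { |name, price| [name.upcase.to_sym, price * MARKUP] }]
  },
  "each.with_object" => lambda {
    TOYS.each.with_object({}) { |(name, price), memo| memo[name.upcase.to_sym] = price * MARKUP }
  }
}.freeze

# Print one timing row per approach; 18 is the label column width.
Benchmark.bm(18) do |bm|
  APPROACHES.each do |label, approach|
    bm.report(label) { ITERATIONS.times { approach.call } }
  end
end
```

Keeping every approach behind the same lambda interface makes it easy to confirm they all return the same hash before trusting the timings.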
Ranked by quickest performance:
1. #each
2. #inject (1.6x #each)
3. #map/Hash literal (2.1x #each)
4. #each.with_object (3.1x #each)
The #each statement takes it! But I didn’t expect to incur such a performance penalty using #each.with_object. 3.1x! Ouch. The slick looking #inject maybe isn’t so slick, finishing in 1.6x what it took #each to complete. Maybe because #each is such a common method it has been highly optimized? I don’t have the answer. If speed is your game maybe the obvious route is the best.
Sources:
Ruby Docs - Tap
Ruby Docs - Each with Object
Ruby Docs - Inject
“Ruby’s inject/reduce and each_with_object” - Keith R Bennet
“tap vs. each_with_object: tap is faster and less typing” - Gavin Kistner
“Inject vs. Each_With_Object” - Alex Wilkinson
“Ruby - #tap that!” - John Crepezzi (just glanced at this one but it looks worth a read)