Now I’m starting to get into the deep dark corners of flay. The #process_sexp
method is the next step in the process.
The Code
    def process_sexp pt
      pt.deep_each do |node|
        next unless node.any? { |sub| Sexp === sub }
        next if node.mass < self.mass_threshold

        self.hashes[node.structural_hash] << node
      end
    end
Review
#process_sexp is running a collection routine to store the parts of each s-expression into the shared hashes data structure. It starts by recursively applying the block to each s-expression using #deep_each. This should make sure that each section of code is looked at for duplication:
- the class as a whole
- the methods in the class
- the code inside a method
- the single line of code
- the atoms of each Ruby statement
Each of those gets yielded to the block in #process_sexp, which does two checks before adding it to the hashes.
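Before looking at the two checks, here is a rough sketch of how I picture the recursive walk. MiniSexp and the ms helper are made-up stand-ins for illustration, not the real Sexp class, and I'm assuming the walk visits every nested s-expression depth-first, parents before children.

```ruby
# MiniSexp is a hypothetical stand-in for Sexp, used only for illustration.
class MiniSexp < Array
  # Yield every nested MiniSexp, depth-first, parent before its children.
  # The receiver itself is not yielded, only what is nested inside it.
  def deep_each(&block)
    each do |sub|
      next unless sub.is_a?(MiniSexp)
      block.call(sub)
      sub.deep_each(&block)
    end
  end
end

# Hypothetical helper that mimics the s(...) shorthand.
def ms(*args)
  MiniSexp.new(args)
end

# Roughly: class Foo; def bar; puts 1; end; end
TREE = ms(:block,
          ms(:class, :Foo,
             ms(:defn, :bar,
                ms(:args),
                ms(:call, nil, :puts, ms(:lit, 1)))))

# Record the type (first element) of each node as it is visited.
VISITED = []
TREE.deep_each { |node| VISITED << node.first }
```

In this sketch the class is visited first, then the method, then the pieces inside the method, which matches the list above.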
    next unless node.any? { |sub| Sexp === sub }
This checks whether the current node (s-expression) contains at least one other s-expression. Nodes made up only of atoms, with no nested s-expressions, are skipped.
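To convince myself of what that any? call does, here is a small sketch using plain Ruby arrays in place of real s-expressions. The nested? method and the two sample nodes are my own names, not part of flay.

```ruby
# Plain arrays stand in for s-expressions here (an assumption for illustration).
LEAF   = [:lit, 1]                       # only atoms inside
BRANCH = [:call, nil, :puts, [:lit, 1]]  # contains one nested node

# Hypothetical helper: true when at least one element is itself a node.
def nested?(node)
  node.any? { |sub| Array === sub }
end

LEAF_RESULT   = nested?(LEAF)    # false, so a node like this would be skipped
BRANCH_RESULT = nested?(BRANCH)  # true, so a node like this moves on to the next check
```

Note that one nested node is enough to pass; the other elements can still be plain atoms like nil or :puts.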
    next if node.mass < self.mass_threshold
This check makes sure that only nodes at or above the mass threshold are included in the report. Flay defaults this to 16 but includes an option to change it.
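Here is a sketch of how I understand mass, assuming it counts the number of s-expression nodes in a subtree (the node itself plus everything nested inside it). Plain arrays stand in for s-expressions again, and the threshold of 2 is just a small demo value so the example stays readable.

```ruby
# Assumption: mass is 1 for the node itself plus the mass of each nested node.
def mass(node)
  node.inject(1) { |total, sub| sub.is_a?(Array) ? total + mass(sub) : total }
end

DEMO_THRESHOLD = 2  # hypothetical demo value, much smaller than flay's default

SMALL = [:lit, 1]
LARGE = [:call, nil, :puts, [:lit, 1]]

# Keep only the nodes that would survive the `next if mass < threshold` check.
KEPT = [SMALL, LARGE].reject { |node| mass(node) < DEMO_THRESHOLD }
```

With these inputs, SMALL has a mass of 1 and is dropped, while LARGE has a mass of 2 and is kept because it is not below the threshold.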
    self.hashes[node.structural_hash] << node
Finally, if both of those checks pass, the node is added to the shared hashes structure. From what I see, node.structural_hash is a simplified version of the structure that just holds the s-expression types. I think this is what lets flay track how many times a statement was used: similar statements are simplified down to the same structural hash, so they end up in the same bucket.
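To make that idea concrete, here is a sketch of structural hashing with plain arrays. It assumes the "structure" of a node is just its type plus the structures of its nested nodes, with all other values thrown away. The structure method, the sample nodes, and BUCKETS are my own illustration, not flay's actual code.

```ruby
# Assumption: keep the node type and recurse into nested nodes, dropping atoms.
def structure(node)
  [node.first] + node.select { |sub| sub.is_a?(Array) }.map { |sub| structure(sub) }
end

CALL_A = [:call, nil, :puts,  [:lit, 1]]
CALL_B = [:call, nil, :print, [:lit, 2]]   # same shape, different values
COND   = [:if, [:lit, 1], [:lit, 2]]       # different shape

# Group nodes by the hash of their simplified structure.
BUCKETS = Hash.new { |hash, key| hash[key] = [] }
[CALL_A, CALL_B, COND].each { |node| BUCKETS[structure(node).hash] << node }
```

Both calls reduce to [:call, [:lit]], so they land in the same bucket even though the method names and literal values differ. Any bucket holding more than one node is a candidate for duplication.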
If anyone has anything they want to add, please post a comment below. Parsing and s-expressions are not my strengths so I might be totally reading this wrong.