Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Base_Tests slowed down by 4min(!) due to too frequent get_source_location calls #12215

Open
JaroslavTulach opened this issue Feb 3, 2025 · 10 comments
Assignees
Labels
--low-performance -compiler -libs Libraries: New libraries to be implemented

Comments

@JaroslavTulach
Copy link
Member

Running test/Base_Tests currently (2025.1.1-nightly.2025.1.30) takes six minutes:

sbt:enso> runEngineDistribution --run test/Base_Tests
                                                                      
2944 tests succeeded.
0 tests failed.
48 tests skipped.
17 groups skipped.

[success] Total time: 375 s (06:15)

most of the time seems to be taken by calls to Test.get_source_location which calls Runtime.get_stack_trace since #12031. If I comment that code out:

diff --git distribution/lib/Standard/Test/0.0.0-dev/src/Test.enso distribution/lib/Standard/Test/0.0.0-dev/src/Test.enso
index e565209131..fa218849f4 100644
--- distribution/lib/Standard/Test/0.0.0-dev/src/Test.enso
+++ distribution/lib/Standard/Test/0.0.0-dev/src/Test.enso
@@ -226,7 +226,4 @@ type Test
         need to be skipped will not change between versions, so this method should be
         used carefully.
     get_source_location skip_frames:Integer -> Text =
-        stack = Standard.Base.Runtime.get_stack_trace
-        element = stack . at skip_frames+1
-        loc = element . source_location
-        loc . to_display_text
+        "<unknown>"

the whole execution speeds up to 2min 20s!

  • either we need a way to speed Test.get_source_location up
  • or we need to stop calling it so frequently

CCing @radeusgd as we already had a discussion about this issue:

the whole point of get_source_location is we jump a few frames

That cannot be fast operation, alas. Even if we return back the Meta.get_source_location builtin (removed in #12031), it is not going to be too fast. We can take current location - Error.throw "here" that's fast - but capturing the stack shall only be done when the test really fails.

@JaroslavTulach JaroslavTulach added --low-performance -compiler -libs Libraries: New libraries to be implemented labels Feb 3, 2025
@github-project-automation github-project-automation bot moved this to ❓New in Issues Board Feb 3, 2025
@JaroslavTulach JaroslavTulach changed the title Base_Tests slowed down by **4 minutes** due to too frequent get_source_location calls Base_Tests slowed down by 4min(!) due to too frequent get_source_location calls Feb 3, 2025
@radeusgd
Copy link
Member

radeusgd commented Feb 3, 2025

That cannot be fast operation, alas.

What was the overhead before the builtin was removed?

@radeusgd
Copy link
Member

radeusgd commented Feb 3, 2025

But yeah so as we discussed we probably want to migrate towards fetching the location lazily only upon failure.

The offsets changing and breaking the location code often was the issue why the location was computed more eagerly - at the beginning of the method it was easier to get stable offsets.

As @JaroslavTulach we probably want to make the location lazy again and find the caller by something like:

Runtime.get_stack_trace . drop (..While (f-> !f.starts_with "Standard.Test")) . drop 1

although I think f.starts_with "Standard.Test" and drop 1 will not be enough as the 'lazy location' can be several frames deep inside of the function. We probably need to find the outermost frame of the current function and take the next frame (plus any frame offsets).

@JaroslavTulach
Copy link
Member Author

20s

Computing locations in Date_Spec.spec_with.shifts takes 20s by itself!

@GregoryTravis GregoryTravis self-assigned this Feb 4, 2025
@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Feb 4, 2025

This is my idea how to "anchor a location" and find it in a stacktrace that's allocated later:

from Standard.Base import all

type Where
    Loc ~err ~run

main =
    v = [1, 2, 3 ]

    action anchor = 
        v.map x->
            if x < 3 then x else
                anchor_frame = anchor.err.stack_trace.at 0
                anchor_name = anchor_frame.name.take (..While (ch-> ch!='<'))
                whole = Runtime.get_stack_trace
                stack = whole.drop (..While (frame-> anchor_name != frame.name))
                
                IO.println "=== current stack is: ==="
                whole.map IO.println
                
                IO.println "=== simplified stack is: ==="
                stack.map IO.println

                x

    where = Where.Loc (Error.throw "Here") (action where)
    where.run
    Nothing

the output is:

=== current stack is: ===
(Stack_Trace_Element.Value 'w.main.action.case action<arg-2>' (Source_Location w.enso:14:25-47))
(Stack_Trace_Element.Value 'w.main.action.case action' (Source_Location w.enso:11-23))
(Stack_Trace_Element.Value 'Array_Like_Helpers.map.Array_Like_Helpers.map' (Source_Location built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/lib/Standard/Base/0.0.0-dev/src/Internal/Array_Like_Helpers.enso:232:56-77))
(Stack_Trace_Element.Value 'w.main' (Source_Location w.enso:26:5-13))
=== simplified stack is: ===
(Stack_Trace_Element.Value 'w.main' (Source_Location w.enso:26:5-13))

e.g. even the failure happens inside of .map closure, the code can locate the w.main function and drop the stacktrace above w.main.

@GregoryTravis GregoryTravis moved this from ❓New to 📤 Backlog in Issues Board Feb 4, 2025
@GregoryTravis GregoryTravis moved this from 📤 Backlog to 🔧 Implementation in Issues Board Feb 4, 2025
@enso-bot
Copy link

enso-bot bot commented Feb 4, 2025

Greg Travis reports a new STANDUP for today (2025-02-04):

Progress: - slow test locations

  • tested 'anchor' method
  • integrating it into extension methods It should be finished by 2025-02-06.

Next Day: default paramters

@enso-bot
Copy link

enso-bot bot commented Feb 5, 2025

Greg Travis reports a new STANDUP for today (2025-02-05):

Progress: - anchor working for regular and nested calls

  • comparing full location dump in Vector_Spect (563 tests)
  • currently there are extra locations logged that shouldn't be there; the goal is that the two location dumps are identical, for before and after the change It should be finished by 2025-02-06.

Next Day: make location dumps equal for entire test suite

@JaroslavTulach
Copy link
Member Author

JaroslavTulach commented Feb 6, 2025

anchor working for regular and nested calls

Does it make test execution fast(er)?

@JaroslavTulach
Copy link
Member Author

I am running test/Table_Tests and it is taking long time. That's not surprising when 70 seconds is spent by taking the location:

70s in source location

the goal is that the two location dumps are identical

Given the current situation I might even accept the locations to not be 100% correct, but the execution to be fast...

@enso-bot
Copy link

enso-bot bot commented Feb 6, 2025

Greg Travis reports a new STANDUP for today (2025-02-06):

Progress: - switched to a simpler method that uses known function names instead of an anchor

  • verified correct for all test utils that don't use callback
  • verified faster
  • determined that remaining callback cases depend on whether the callback takes arguments or not It should be finished by 2025-02-07.

Next Day: fix callback cases

@enso-bot
Copy link

enso-bot bot commented Feb 7, 2025

Greg Travis reports a new STANDUP for today (2025-02-07):

Progress: - working to get callback behavior correct It should be finished by 2025-02-10.

Next Day: fix callback cases; test through induced errors

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
--low-performance -compiler -libs Libraries: New libraries to be implemented
Projects
Status: 🔧 Implementation
Development

No branches or pull requests

3 participants