Jase00's Recent Forum Activity

  • I was sure this was posted on the current suggestions github, but haven't spotted it yet (link it here and I will edit post and add a comment there, or I will post it myself and link it). I know I posted it years back, and could have sworn I saw it again this year!

    AND yes, sorry, I know the point of the features suggestions github is for exactly this, but I guess it's random hope after seeing the exciting recent UI changes in Construct 3. No demands here!

    Object Icons

    Objects, such as Sprites, TiledBG, they auto-generate an icon for themselves - Very helpful visually when scrolling through event sheets.

    Other object types, such as JSON, Dictionary, Array, they have the same icon. Not as visually identifiable, or sometimes impossible to identify if you choose to set the Event Sheet UI to hide the object's name partially.

    If ANY object type, including Sprites and Families, could have a dedicated "icon" section, two buttons like "Use default icon" and "Edit" which opens the image editor window to let you draw an icon... wow... this would save so many mistakes I continue to make, despite my lengthy time using Construct! Honestly I'd take a pre-defined list of icons at this point.

    Why does it matter?

    Let's say we have many Array object types: Sure, you are careful when starting off, but when in a deep flow state and working on events that involve 2 or more Arrays, you may unintentionally make mistakes with logic. Maybe the name of the object is long, or starts similarly, like "jsonUI_Window" and "jsonUI_Button".

    "get gud, read more", I DO, but Idunno, I comprehend things visually, the very reason I enjoy Event Sheets so much.

    Workarounds

    I've adapted over the years to make this as smooth as possible, such as renaming objects to be obvious earlier in the object's name (e.g. "jWinUI", "jButUI") although it makes things less organised/tidy, and I still make the same mistakes, especially when all names are shorter and thus looking for 2 or 3 character differences.

    I also tried to keep things in fewer object types, e.g. 1 array with a few instances and an instance var to manage types, but this has other issues - both practically and when using C3's features like find/find all references when there's hundreds of results.

    Anyway, yes... Thoughts?

  • I keep thinking and reflecting on all this stuff (as you'd imagine from my walls-of-text).

    It can also seem like Scirra are being closed-minded or something, but I think there's more to it.

    Everything in this topic is not relevant to most projects.

    It is mostly relevant to those who are making an ambitious project, or their project includes an ambitious system.

    An ambitious system could be desiring 500 active enemies, each behaving independently with path finding and various variables to "think" about their behaviour. A new user will hit performance issues in so many ways where other experienced users would have an easier time achieving this.

    But, if you're making a smaller game, or even a platformer with huge levels with many interactive objects, but no extreme systems, then usually this is achievable for new users, simply by doing the "intuitive" thing and throwing a layout together (and C3 works its magic with Collision Cells and such).

    Is it the optimal way to go? Maybe, maybe not. Does it matter? That's Scirra's point that is totally fair for general games. If you have an ambitious game or wish to target weak devices, then both require the will to learn/measure.

    If someone is new to Construct, Tech, or GameDev, then they will struggle to do ambitious ideas, until they learn and read and experiment.

    Take "picking" - Experienced people know the ins/outs of this, whereas newcomers, even those that used a programming language before, find "picking" to be alien/unintuitive/confusing.

    Whereas for me, having used Construct for nearly 2 decades, picking is the thing I have 0 issues with, even with the "quirks"; no "quirk" in this area has stopped me or slowed me down, and we have many tools to manage picking. But when I was less experienced, you bet I was frustrated; looking back, it felt unfair to be frustrated, as the tools and documentation were there for me to learn. It's easy for me to get stuck in my way of thinking.

    Some of this feels true with nerdError's post:

    Even though I'm a big fan of C3, I really hate all the performance quirks it has. And you can mostly only find them by extensive testing and measurements (which with addition of all the random fluctuations and hardware dependency can be very time consuming and just lead to nothing). Well, also there is good amount of bugs too

    It feels like this implies there's a lot of performance quirks, but I disagree here.

    And "Extensive testing/measurements" - It's not like measurements take hours to achieve; New project, throw a sprite that "moves" (e.g. Sine behaviour) so FPS is not at 0, add your measurement tests (If needing to test many objects, a quick "For" loop to spawn many). Then test.

    One would only test if they've thrown something into their project and noticed lag/FPS drops/CPU spikes. Or, if someone has a system that works fine but they yearn for more.

    I do agree with the confusion being brought here with "how" to measure, but there's info here on some preferable methods. I also think that some measurements will give obvious results regardless of the CPU load and CPU speeds. Granular measurements are trickier, maybe the "I get X% CPU with 1000 objects, can I get a 5% CPU gain from another method with 1000 objects?" type of tests. But that kind of test is pointless; If you get your 5% CPU boost, then in-game when there's maybe commonly only 250 objects and rare moments of 1000, then the 250 object moments had gained maybe less than 1% CPU.
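
    To put rough numbers on that last point, here is a minimal sketch. It assumes the saving scales linearly with object count, which is only an assumption (real scaling may differ), and the function name is made up for illustration.

```python
def scaled_saving(saving_at_full: float, full_count: int, count: int) -> float:
    """CPU points saved at `count` objects, ASSUMING the saving scales linearly."""
    return saving_at_full * count / full_count


# The scenario from the post: a 5-point gain measured with 1000 objects.
common = scaled_saving(5.0, 1000, 250)   # the common in-game moments
rare = scaled_saving(5.0, 1000, 1000)    # the rare busy moments

print(f"250 objects:  ~{common:.2f} CPU points saved")
print(f"1000 objects: ~{rare:.2f} CPU points saved")
```

    Under that linear assumption the common 250-object moments gain about 1.25 points, the same ballpark as the rough figure above.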

    But in C3 you can ruin optimization of the game with so many random small things, and it explained nowhere. Docs are really bad in that. There is so much hidden things that are never explained, and they matter like A LOT, both for how your game works and how it performs.

    With experience, it doesn't feel like "so many random small things". A lot of it is logical imo, and I'm sure most agree. E.G. Putting "For Each" as the top-level condition. Admittedly, I used to do this for my projects, but wasn't thinking logically; I just followed the rule of "Don't use them much", so I thought a single top-level "For Each" was ideal, but this was wrong. IIRC it is explained in the documentation for For Each.

    But I do somewhat agree with a few random small things affecting performance for an ambitious project or system; the act of just using "Evaluate expression" within a "For Each" to gain some CPU is wild. But I say "somewhat agree" because it makes logical sense; "Sprite > Compare Variable" would do picking under the hood, despite 1 Sprite to pick, and it would be a tiny insignificant thing that C3 does, but could add up if designing an ambitious system with 500 enemies with subevents/conditions below the "For Each".

    Also events have so much overhead just by default.

    It's known this is true, but again, not for general projects, just ambitious things - The great thing about this topic is finding ways to get the performance out of event blocks.

    Without that, it just a complete guess game and so many wasted hours

    Well it's not a guessing game - Measure it! I also have wasted hours in the past, especially by trying to "measure" within the main project, rather than making a blank new project.

    For example, just changing 9patch settings "Edges" and "Fill" to "Tile" leads to CRAZY performance cost. With 100 9patch objects, the amount of draw operations increased from 23 to... 1213, that's a 5173% increase just from changing a few settings on the objects that doesn't seem like they matter at all. Is it mentioned anywhere? Of course not

    I think this is fair to bring up, but not an overall fair example. For one, 9 Patch setting changes are brand new from about a month ago - If there's performance issues, report it, and if it's by design or impossible to resolve, often a documentation tweak occurs.

    And 2nd, Ashley highlighted this, so I won't repeat it, but it's about thinking of the logic behind something - a 9-patch change is likely going to do something to the 9 pieces of texture it contains (except for the common Set Size and such).

  • ...in the case of "has tags", in one case you just compare a string which is a very simple and quick operation for a CPU. On the other hand "has tags" has to split the given string by spaces to extract individual tags, and then verify that all the provided individual tags are in the set of tags for the given instance. It's probably at least 10x as much work as just comparing a string. It's not that it's slow...It's just that you've used a feature which necessarily includes more complex steps. So if you make a benchmark that absolutely hammers that specific feature, you will probably see something like a 10x difference.

    This is a good example of understanding "under the hood" a bit more so that we can make decisions on what to use/how to structure things.
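
    The two operations described in the quote can be sketched roughly like this. It is a plain Python illustration of the steps as described (compare one string vs. split the query and check every tag against a set), not Construct's actual implementation; the function names and the tags are hypothetical.

```python
def tag_equals(instance_tag: str, wanted: str) -> bool:
    # One direct string comparison.
    return instance_tag == wanted


def has_tags(instance_tags: set[str], wanted: str) -> bool:
    # Split the query on whitespace, then require every tag to be present.
    return all(tag in instance_tags for tag in wanted.split())


enemy_tags = {"enemy", "flying", "boss"}  # hypothetical tags on one instance

print(tag_equals("enemy", "enemy"))
print(has_tags(enemy_tags, "enemy flying"))
print(has_tags(enemy_tags, "enemy ground"))
```

    The second function simply has more steps per call, which only becomes visible when it runs many times per tick.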

    Typical usage for any structure is going to perform great, but when using "loops" or "many instances", it can be desirable to explore a different approach or to understand as many optimisation tips as possible (many small optimisations for a loop will add up).

    I too went to use tags as a form of identifying many instances within loops, did measurements, and opted not to (not really understanding why it gave higher CPU than other methods). With your insights, had I written a bug report about Tag performance, it would have cost my time and Scirra's time to do the bug report process to ultimately end with "this is by design".

    Without those insights, I assumed tags were like a small dictionary hidden within instances, and that "has tags" automates a dictionary "has key" loop and picks objects; I didn't know that it always splits the string when using "has tags" (I've come to find that loops with "tokenat()" can be heavy, and try to use "Array Split String" outside of a loop when possible).

    I suppose this means it's ideal to not put "Has tags" in a heavy loop, and if wanting to do different things to many instances with various tags, keep it base level with a handful of "Has tags X", "Has tags Y", and each event can have For Each after this condition, if required (often not required).
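
    The "split outside of the loop" habit mentioned above looks roughly like this in plain Python. It is a generic illustration only: whether tokenat() really re-splits the string on every call is an assumption here, and the function names are made up.

```python
def sum_tokens_split_each_time(csv: str) -> int:
    # Mimics a token-at-index lookup inside a loop:
    # every iteration splits the whole string again.
    count = len(csv.split(","))
    total = 0
    for i in range(count):
        total += int(csv.split(",")[i])
    return total


def sum_tokens_split_once(csv: str) -> int:
    # Split once before the loop, then just read the parts.
    parts = csv.split(",")
    return sum(int(p) for p in parts)


data = "1,2,3,4,5"
print(sum_tokens_split_each_time(data), sum_tokens_split_once(data))
```

    Both give the same answer; the second just does the splitting work once instead of once per iteration.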

  • So I got unhealthily curious about CPU power management.

    [Note: I oversimplify CPU talk here. It's not all about CPU Speeds/GHz, it's also about type of CPU, generation, age, etc.]

    Turns out you can control your processor speed in Windows. Not explaining how. Your own risk if you explore this.

    Doing this stabilised my CPU's clock speed. At worst, only changing +-0.01, but frequently staying completely static.

    This gave far more stable C3 readings, bit jumpy with unlocked framerate but less range with the jumpy numbers.

    This guarantees CPU readings aren't affected by CPU power shenanigans, but is NOT a real world scenario, just an experiment to rule-out CPU power management.

    Maxing out CPU caused unpredictable CPU speed, doing tests that were 10%, 30%, etc, all affected CPU speed. Indeed, CPU is affecting our results.

    The way we measure now, if you let your CPU do its own thing and fluctuate as workload gets higher, then:

    1. Running one test at a time, your CPU would reach differing speeds during each test. Less accurate results. (I did this method for many months).

    2. Doing tests with "2 Groups enabled and compare CPU Profiler", CPU gets even higher than doing singular tests (The difference in CPU will also be less clear, explained below).

    Notably, the higher your CPU speed, the harder it is to observe a CPU difference between two tests. No use fluffing up both tests with a "For" loop, as CPU will just boost its speed and further decrease the CPU difference. If you had an imaginary 100GHz CPU, none of our measurements would even show a 0.1% difference and we wouldn't know there was a difference that could affect the average player with a 2.5GHz CPU. (Always test on lower-powered devices if you aim to target them).

    Measurements with fixed CPU speed

    I took the same example I posted in this topic: "Dummy Return function" vs "For Each loop", 1000 objects every tick, one test at a time.

    Had to approximate measurements still, since it's never a perfectly-stable number (esp with higher GHz).

    I also tested various clock speeds, in all 3 Framerate modes (VSync, Ticks Only, Full Frames) and noted down the CPU % for VSync, TicksPerMin for Ticks Only, and FPS for Full Frames.

    Why did I do this? I don't know. Maybe someone finds it interesting or can extract some information from it, knowing it's a fixed stable CPU speed on every measurement.

    --- No CPU Restriction ---

    (Idle with preview open using 2% CPU, GHz floats between 4.15GHz and 4.21GHz. Very approximate as CPU speed changes often.)

    For Each - CPU: 13%, TPM: 1000 [~4.24GHz], FPS: 940 [~4.22GHz]

    Return Fn - CPU: 8%, TPM: 2250 [~4.24GHz], FPS: 1840 [~4.22GHz]

    --- 2.28GHz ---

    For Each - CPU: 20%, TPM: 560, FPS: 545

    Return Fn - CPU: 12%, TPM: 1220, FPS: 1000

    --- 1.48GHz ---

    For Each - CPU: 30%, TPM: 360, FPS: 345

    Return Fn - CPU: 17%, TPM: 760, FPS: 670

    --- 0.98GHz ---

    For Each - CPU: 43%, TPM: 225, FPS: 218

    Return Fn - CPU: 25%, TPM: 475, FPS: 420
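
    For anyone wanting to compare the readings above at a glance, here is a small sketch that computes the ratio between the two tests at each clock speed. The numbers are copied from the measurements above; the dictionary layout and function name are just for illustration.

```python
# (CPU %, TPM, FPS) per test, copied from the readings in this post.
readings = {
    "unrestricted (~4.2GHz)": {"for_each": (13, 1000, 940), "return_fn": (8, 2250, 1840)},
    "2.28GHz": {"for_each": (20, 560, 545), "return_fn": (12, 1220, 1000)},
    "1.48GHz": {"for_each": (30, 360, 345), "return_fn": (17, 760, 670)},
    "0.98GHz": {"for_each": (43, 225, 218), "return_fn": (25, 475, 420)},
}


def speedups(row):
    """How much better Return Fn did than For Each, as (CPU, TPM, FPS) ratios."""
    fe_cpu, fe_tpm, fe_fps = row["for_each"]
    rf_cpu, rf_tpm, rf_fps = row["return_fn"]
    return (fe_cpu / rf_cpu, rf_tpm / fe_tpm, rf_fps / fe_fps)


for label, row in readings.items():
    cpu, tpm, fps = speedups(row)
    print(f"{label}: CPU x{cpu:.2f}, TPM x{tpm:.2f}, FPS x{fps:.2f}")
```

    The ratios stay in a similar range at every clock speed (roughly 1.6x to 1.8x on CPU, and about 1.8x to 2.3x on TPM and FPS), even though the absolute numbers change a lot.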

    The kicker? This is all absolutely pointless lol. Regardless of the CPU doing its thing and adapting, or a fixed CPU speed, I still have the same lesson I learned from when I whipped up the original c3p file in 5 minutes.

    With my ignorance these past months, I measured mainly by comparing CPU, running 1 test at a time, re-running a few times to see if any sudden differences occur.

    Any test I have done, has benefited me in my C3 journey:

    - Placing conditions in ideal ways

    - Knowing when/when not to use conditions/actions

    - Knowing the expensive actions

    - Finding that "comparing longer strings can increase CPU insignificantly, but can add up quickly if reading many long strings in a large loop, so opt for a number or shorter string".

    - Learning that getting a value from a deeply-nested JSON gets expensive quickly, not for 1 "Get" but observable in a loop - but keeping nests minimal with hundreds of keys is fine.

    - JSON performs the worst in many read/write/iterate tests when compared with Array/Dictionary, but is still powerful and an ideal way to store complex relational data.

    - Understanding "Pick by UID" or "Pick Child/Parent" is the categorical optimal way to pick an object, again, especially if in a long loop - Instance Variables could eat away at CPU... (But now there's exploring "Evaluate expression", which may render Instance Variables more ideal to use in a busy loop.)

    These all feel accurate, and if inaccurate or misremembering... Can go and measure.

  • Thank you Ashley.

    I spent a lot of time measuring, I kept in mind the very quote in official performance advice: "Measure measure measure"...

    I thought this thread was great. No guessing, folks sharing screenshots and project files. I've remade some tests and observed similar results.

    FWIW even if some things learned here aren't ideal in most cases, like usage of "evaluate expression", it's now another direction to consider if looking to optimise. I knew of things, but didn't expect the results in the specific cases shown.

    It's unfortunate we couldn't learn some confusingly-interesting insights about some mysteries of results; never an obligation, just strong curiosity from us all. I feel the strong pushback against measurements was surprising considering project files were shared for people to check themselves, but presumably you are considering general users that see this topic/the thread seemed too authoritative on how to design events when they could trip up others.

    ... and all modern processors have advanced power management. This means it's possible to make a benchmark and slowly increase the amount of work, and then at some point the CPU/GPU measurement suddenly drops a lot. You might think "OMG! Adding more work made it faster, WTF?" but all that happened is you created enough work for the processor to step up from slow, low-power mode to fast, high-power mode.

    I am not as familiar with CPU power management (is it just the GHz reported by your CPU or is there more to it?), but observed my CPU GHz during tests and see it rise and lower relative to CPU usage of C3, although flickers +-0.2 whether at 10% or 100% CPU with FPS dropping to 10%

    Regardless, looking at two tests running, the difference in CPU between the two tests maintains a fairly consistent distance, even if inflating tests fairly with a For loop. It feels logical to explore the winning test, even if 5% or 10% difference (ofc depending on the initial test). Turning off VSync and running 1 test at a time would be further confirmation, surely?

    Admittedly I often go with just CPU comparisons that don't max out; it tends to work for me when implemented into a real-world project. E.G. the "dummy return function" I implemented for a system I've been trying to solve for months, from 10% down to 2% CPU (setting opacity of about 200 objects with a wacky equation pulling from various places related to that specific instance - For Each was my go-to method most of the time).

    It's a presumption but I figured the types of optimisations being looked at in this thread are more for event systems that can "grow" (i.e. wanting more enemies active or something, the more the better). Sure it's working great with 50 or 100 enemies maybe at 30% CPU (knowing you need to keep things lite for other gameplay CPU tasks), but then to find that using a different order of events/different condition could grant an extra 100 enemies or more without affecting anything, is curious to learn about.

    Granted some benchmarks are overkill, and it's easy to be quick to think "HEY this 10000 iteration loop is slow!". But I think there's been some realistic examples in this thread.

    I'd also note running events once with N instances picked is almost always more efficient than running events N times with one instance picked, because the latter repeats the overhead of the event engine.

    Can't deny I'm curious why the "dummy return function" is more optimal than a single event with a "For Each" loop (I posted earlier on this thread), though I'm certain it relates to what you noted here. I presumed calling a return function when many instances of a Sprite are picked, would be calling a function many times, plus running the return function's "pick by UID" and such (altho fast to pick by UID, I thought it's still "more work" dealing with a function and an extra condition). It's not like functions behave like custom actions and pass the SOL, so... No clue!
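
    The quoted point about repeated event overhead can be pictured with a toy cost model. This is purely illustrative: the cost units are made up, the function names are hypothetical, and it says nothing about how Construct's runtime actually spends its time (it also doesn't explain the "dummy return function" result).

```python
def cost_picked_once(n: int, overhead: float, work: float) -> float:
    # One event run handles all n picked instances: overhead is paid once.
    return overhead + n * work


def cost_one_at_a_time(n: int, overhead: float, work: float) -> float:
    # The event runs n times with one instance each: overhead is paid n times.
    return n * (overhead + work)


# Hypothetical units: 5 per event run of overhead, 1 per instance of real work.
print(cost_picked_once(1000, overhead=5.0, work=1.0))
print(cost_one_at_a_time(1000, overhead=5.0, work=1.0))
```

    In this toy model the gap is entirely the repeated overhead, so it grows with the number of instances and with how heavy the per-run overhead is relative to the actual work.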

    Even Construct's event blocks, with the overhead of the event system, is so fast that one benchmark we did a few years ago showed Construct events being 5x faster than GameMaker Language in VM mode, and still nearly as fast as GML when compiled to C++!

    This was a cool benchmark and never left my mind!

    Causing people to do contrived, inconvenient things to their projects that are entirely unnecessary.

    It'd be a shame to not see this type of thread pop up if folks worry about leading people down the wrong path - it's really engaging, but I understand the concern if that's the case. Hopefully if anyone shares something, they write a warning, and I'd like to think newcomers wouldn't be racing to learn about optimisations and ignore the overwhelming confusion of this thread until they've learned more

    This kind of thing comes up so much and seems to mislead people so consistently, even after explaining the pitfalls, that I do wonder if it would be better just to get rid of Construct's timer-based CPU/GPU readings. However they are still useful for general "is it low or high" type readings, for example knowing whether you've maxed it out or not, so I don't think we should get rid of them.

    I know you said you won't remove it, but - I LIVE in that profiler!! It has guided me to mistakes so often. I've optimised a 16k-event project from 40% CPU down to 17%. It's helped tremendously with measurements. Even if I'm doing it wrong, whatever I measure and feel is optimal, I implement, CPU goes down - This helps reassure me my game will run well on low-end devices, even if a little bit more. (Gotta get the old integrated graphics laptop out - worked moderately well before, must be great now!)

  • Jase00 The nesting of the events doesn't seem to affect the speed significantly.

    So far the fastest way to do things is to have the loop as the last thing after picking stuff, or you can use compares after the loop as long as they are in sub-events.

    Thank you for measuring. I was so certain I've read in the past about nesting subevents being a crux, but maybe it was a very long time ago... or I imagined it... Cramming in so many tidbits, maybe misremembering a few!

    It's great to know, glad I don't need to investigate and refactor things related to this!

  • What are the chances that the way the system handles these comparisons under the hood, which currently leads to these results, could change—so that another approach becomes more efficient, or that what is efficient now simply turns into the least efficient? What are the patterns? Do they exist?

    I feel it's less likely to change, unless Scirra find ways that provide exact same results but with boosted performance (but imagine the risk behind that - In a rare case someone uses events that gives a different result, could break many projects).

    Or as I noted somewhere before - Say there are ways to boost these comparisons under the hood, but they require a tiny change to the base of the conditions (e.g. "Sprite - Compare Instance Variable" needs a tiny change to check if a "For Each" is running, or to check if only 1 object is picked, so that it can skip picking checks (assuming this is what is currently happening)). It could bring down performance "generally", now that all "Sprite - Compare Instance Variable" conditions are doing this quick check, which could add up quickly, especially when it's such a common condition, used probably 1000s of times in a large project where this "quick check" may not be relevant since those events do need to fire in the current way (rendering that "quick check" useless in most cases, yet it still has to do the check and use a tiny bit of CPU).

    With the above scenario, personally, I'd love to see more Actions/Conditions added to cater towards both scenarios. I wouldn't see it as bloat, I'd see it as more pathways to cooperate with how C3's runtime works. Whether it means a new "For Each (No SOL)" (as a random nonsensical example) or various new actions/conditions to be able to keep things readable (it'd look a lot less readable in events with many "evaluate expressions" with no Sprite icons and such), I'd be all for that.

    Of course, if Scirra discover any performance boosts they could implement, I'd imagine they'd race to do that, especially if it's not a fundamental change, or if the boost is significant enough to risk breaking older projects then they may trial it in a beta.

    But I wouldn't fear much of this information suddenly changing overall. And if it did, it could only be for the better - But I see your point, should you sift through your project (or the particular systems within your project that you wish to get maximum efficiency out of), and start replacing things with "evaluate expression" where possible... or wait it out in case Scirra hatch an idea?...

    I suppose waiting for some insight would be ideal, there's plenty to work on elsewhere in a project! But if desperate for those CPU gains, can implement things in this thread. Even if Scirra made some alternative boost in an unexpected way (e.g. a new condition/action/event block type), then we would migrate anyway if we seek those gains!

  • Thank you R0J0, I will study these results and your explanations deeply, it's very useful to myself and clearly many others.

    It's interesting the best performing result is a nested subevent - Whilst I haven't measured it exactly, I seem to have it in mind to "try to avoid nesting subevents where possible" (although I barely follow this since the gains of subevents almost always outweigh trying to lower subevent nests).

    Perhaps there's truth to trying to avoid too much nesting with subevents, but maybe in the case of a loop, less important since it's tackling the specific nest and subsequent nests? I presume Sub-Events function similarly to JSON nesting, where the more you nest, the more CPU usage needed to get the deeper nest (i.e. getting a JSON key at A.B.C.D.E.F is more expensive than getting the key from A.B, even if there's many keys present at A.B and just 2 keys present at A.B.C.D.E.F), plus event blocks have the added complexity with storing SOLs at various levels and reloading SOLs and such.
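
    The JSON part of that presumption can be sketched as a generic path walk. This is a plain Python dictionary lookup, not the Construct JSON plugin's implementation; the function name and sample data are hypothetical. It only shows that the number of lookups follows the depth of the path rather than how many sibling keys sit at a level.

```python
def get_path(data: dict, path: str):
    """Walk a dot-separated path; return (value, number of lookups performed)."""
    node = data
    steps = 0
    for key in path.split("."):
        node = node[key]
        steps += 1
    return node, steps


# Several keys at A.B, and a single key buried at A.B.C.D.E.F.
doc = {"A": {"B": {"x": 1, "y": 2, "z": 3, "C": {"D": {"E": {"F": {"k": 42}}}}}}}

print(get_path(doc, "A.B.x"))
print(get_path(doc, "A.B.C.D.E.F.k"))
```

    Whether sub-events behave the same way is a separate question that only measuring can answer.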

    ...More things to measure!

    Imagine having a 6,000-line project and needing to refactor a bunch of things that could have been done right from the start, just by realigning items.

    ... heh yeah, imagine! My project's at 16,000 events, no big deal!...

    ...

    ...send help... possibly send help to Je Fawk and a number of others too...

  • Another tip - Again, measure, but this one I know is true, and measured before writing this post, and have provided c3p to share.

    You can get a more performant For Each by using... Return functions! (Again MEASUREEEEE for your case!).

    Explanation

    Much like how you can do "Sprite n > 5, set Sprite opacity", and it applies to each Sprite without needing a "For Each", you can utilise this by using return functions, as they too run for each Sprite, without needing "For Each".

    I measured:

    1000 sprites, 500 have n = 0, 500 have n = 10

    Two test keyboard keys to hold:

    A Key = Sprite n > 5 , For Each , Pick Child , Set child opacity to 55

    S Key = Sprite n > 5 , Set Sprite dummyVar to Functions.DummyFunction(Sprite.UID)

    DummyFunction: Pick Sprite by UID , Pick Sprite Child , Set child opacity to 55

    Now, this seems wild - The "dummy return function" seems way more busy, and has to run a function, and pick the sprite via UID...

    ... But the results?:

    For Each - CPU 13% (Unlocked Frames: 900fps)

    Dummy Return - CPU 8% (Unlocked Frames: 1800fps)

    MAGIC.

    C3P below:

    drive.google.com/file/d/17CTtsNiO59uh5h8pjgY_4kTNPj6ooRQI/view

    (I would have included some clearer checks like a "Children updated" counter, but wanted it to be raw performance, but I quickly checked debugger for values being updated only for the last 500 children, and indeed it seems correct.)

    (Note: Be careful before rushing to implement this everywhere. Whilst I am using this in places and haven't had issue, I have a fear of "reaching call stack", which you may have seen when you run a function too many times in 1 tick - Perhaps this gets reached if over-using functions per tick, although I am not sure.)
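
    As a sanity check on the logic of the two tests described above (not their performance), here is a small plain Python model. It is only a sketch: the class and function names are made up, and it does not reproduce Construct's picking or runtime in any way. It just confirms both routes touch the same 500 children.

```python
from dataclasses import dataclass, field


@dataclass
class Child:
    opacity: int = 100


@dataclass
class Sprite:
    uid: int
    n: int
    child: Child = field(default_factory=Child)
    dummy_var: int = 0


def make_sprites():
    # 1000 sprites: the first 500 have n = 0, the last 500 have n = 10.
    return [Sprite(uid=i, n=0 if i < 500 else 10) for i in range(1000)]


def run_for_each(sprites):
    # A key: Sprite n > 5, For Each, Pick Child, set child opacity to 55.
    picked = [s for s in sprites if s.n > 5]
    for s in picked:
        s.child.opacity = 55


def dummy_function(by_uid, uid):
    # Pick Sprite by UID, pick its child, set child opacity to 55.
    by_uid[uid].child.opacity = 55
    return 0  # the return value itself is a throwaway


def run_dummy_return(sprites):
    # S key: Sprite n > 5, set dummyVar to DummyFunction(Sprite.UID).
    by_uid = {s.uid: s for s in sprites}
    for s in sprites:
        if s.n > 5:
            s.dummy_var = dummy_function(by_uid, s.uid)


def updated(sprites):
    # UIDs of sprites whose child ended up at opacity 55.
    return [s.uid for s in sprites if s.child.opacity == 55]


a, b = make_sprites(), make_sprites()
run_for_each(a)
run_dummy_return(b)
print(len(updated(a)), len(updated(b)))
```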

  • Imagine having someone who really understands how things work under the hood, to explain it…

    But I guess the answer would be: "Don’t worry about that."

    Lol it be that way occasionally, but I find it goes that way when it's random fluff and filler. I find that Ashley often posts insightful stuff - Often I am Googling a random specific action, and encounter all sorts of insights throughout old forum posts. Sometimes it's interesting to Google something vague, like "Construct 3 Wait" or "Construct 3 Tween Performance", you find lots of valuable information.

    My slight fear of a reply would be "measure" or "would this affect real world scenarios", but I feel the difference here is:

    Measuring IS helping, but it's opening a lot of "why" questions - Why "this" order of events (especially with For Each sometimes being performant when higher-level). So long as we know why we are choosing to use "evaluate expression" and such, even if it's an odd quirk, then we can adapt our thinking to work with C3, and become powerhouses in optimisation.

    "Real World" I like to hope this thread is showing it, particularly with Je Fawk, hopefully my own case, too. I spent a lot of time to eke out performance where it is needed (and I avoid micro-optimisations like "Should I use Bool or Number for my one-off tiny return function" or something). I still hope to find new methods to optimise, and indeed this thread has taught me a lot.

    "For Each", whilst I understand a lot tend to avoid this, you sorta find that you need it for specific cases, thus end up using it a lot "at the end of a list of conditions", or just before the end - Such as: an event block evaluating a bunch of stuff for a Sprite, and there comes a point where you must pick this sprite's child in a hierarchy, so you must use For Each - Totally fine, since all this "For each" is doing, is picking a child (rather than expensively evaluating the entire event block).

    Thanks to this thread, I will be hunting down those "for each" blocks, seeing if I do indeed have some subevents with a quick "Sprite Inst Var = 5", and change this to "Evaluate Expression", measure, see how much improvement is gained.

  • > BTW fun fact, if you set opacity to a decimal number, I recall it eating CPU far more. Makes sense in some ways, C3 gotta convert it to int, but I recall it was a measurable different in cpu. Worth noting if you make your own opacity fading systems, better to wrap your opacity result with int() instead.

    Very interesting, thanks for sharing!

    You too quick to reply!! - I wrote my post, then went to measure just to fact-check and report back here, and I can't repro - I am certain I encountered this about a month or two ago, and kept flicking back between "int()" and not, and seeing a noticeable difference, hence why I posted this with confidence. I measured:

    1000 objects, setting opacity to decimal or not, seems to give same CPU either way. For each, or not... visible, or not... all same result.

    ... Either I'm going crazy, or it's a project-specific setting (WebGPU/Worker/Something), but hey, worth keeping in mind, if you ever wonder why your huge opacity system seems CPU-hungry; somehow this stuck to my mind after long experimenting/measuring a while ago. I'll update if I figure out what the exact scenario was in my main project.

  • Very insightful read, learned a lot! But also feel not 100% in tune.

    I suppose in some ways, it makes sense - when doing "Sprite inst var check", even if it's a For Each, it may be doing the typical Picking stuff, even though it's within a For Each loop and therefore only 1 is picked.

    Whereas "evaluate expression", doesn't do any picking, so using it during a For Each loop, it's taking away the task of picking, whilst still indeed cycling each sprite in order to get the correct variable for the evaluate.

    If this is true, perhaps C3 could be designed to take into account whether a For Each loop is active for an object type, therefore prevent the redundant picking and save some cpu... Although I wonder if that type of check could slightly worsen performance overall for any For Each loop, which wouldn't be ideal... Perhaps it's better to fully understand the exact mechanisms and work with it - but after a decade, never thought Evaluate would be extremely useful here.

    I seek to eke out as much performance as possible in certain areas of my project, where the more performance, the more this system can be utilised (e.g. Event-based particle system, UI). Indeed a small 5% cpu gain could seem pointless in real-world scenarios, but that is a major difference for specific systems a dev might be trying to design.

    EDIT: Measured but this may be incorrect. BTW fun fact, if you set opacity to a decimal number, I recall it eating CPU far more. Makes sense in some ways, C3 gotta convert it to int, but I recall it was a measurable difference in cpu. Worth noting if you make your own opacity fading systems, better to wrap your opacity result with int() instead.

Jase00

Member since 5 Jan, 2012
