Processing Power Transparency?

Onshape is my favorite CAD system of any I've used (including Rhino, Fusion, and Solidworks). There's a fear I can't shake, though, which I admit may be unfounded, but I feel it anyway. I worry that sometime in the future something will happen to cause Onshape's performance to drop below what I need professionally, leaving me and my clients in a bad spot. It could happen for any number of reasons, like:

exponential growth that outpaces server expansion

a financially down year or two, or some larger internal company upset, that causes servers to be reallocated or sold.

I'd love to have some kind of confirmation that, yes, I'm getting a minimum of X amount of processing power assigned to me, almost like a speed test for an internet connection. That way, if I do something silly, like patterning 100 loft features, and it takes forever to compute, I won't be wondering whether Onshape isn't giving me enough juice; I'll know to blame myself. Is there any way to do that?
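For what it's worth, the kind of "speed test" I'm imagining could be as simple as timing a fixed workload and reporting one comparable number. A minimal local sketch (the workload here is invented, and this measures only my own machine; it's just to illustrate the idea):

```python
import time

def cpu_score(iterations: int = 1_000_000) -> float:
    """Time a fixed CPU-bound workload and return iterations per second."""
    start = time.perf_counter()
    acc = 0
    for i in range(iterations):
        acc += (i * i) % 7  # arbitrary busywork, identical on every run
    elapsed = time.perf_counter() - start
    return iterations / elapsed

print(f"CPU score: {cpu_score():,.0f} iterations/sec")
```

Run it twice on two different days and you'd know whether "your" compute changed, which is the reassurance I'm after.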

Note, I'm not talking about the regen time tool. I know that's there, but it doesn't tell me anything about how much computing power I have access to at any given time.

I imagine that all users share a human need for a sense of control and stability, which is one of the biggest psychological challenges that cloud CAD faces vs traditional software. I think transparency about processing power could address some of that for me. Is there already a way to see that stuff? If not, what might it look like to add it?

Do other users think about this stuff? I bring this up mostly to start a conversation about it.

Best Answer

At a high level, Owen is correct: we're dynamically allocating Amazon servers (in various roles -- see https://www.onshape.com/cad-blog/under-the-hood-how-does-onshape-really-work) to handle the load, and we have plenty of room to scale. Amazon has enough hardware to handle the load if every CAD user in the world switched to Onshape (we have work to do to scale to that capacity, but hardware availability won't be a constraint). Also, poor performance is one of the fastest ways to lose customers, so Owen's hypothetical accountant wouldn't keep their job here very long.

Going into a little more detail, what determines Onshape's performance is not what you might expect. If we could pay Amazon 2x the money and have everyone's performance double, I expect we'd do that (we're investing a lot of expensive developer time in performance). The performance of tasks like rendering, FEA, and general number crunching scales roughly linearly with the amount of hardware thrown at it, but CAD is very different. For example, the state of your caches and ours is more likely to affect performance than the CPUs allocated to a user. Here's a very rough and incomplete breakdown of how some things affect Onshape performance:

User's graphics card (pro cards are much more expensive but aren't any better than gaming cards) - how smoothly a model rotates, how responsive selection and hover effects in the model area are

User's CPU - if the browser itself is freezing up (e.g. hovering over toolbar buttons doesn't produce a hover effect) that might be the cause

User's RAM - if too low, can limit part studio and assembly complexity

User's latency to our nearest AWS region - responsiveness of most aspects of the system that are not entirely client-side (so not view manipulation), including modeling, assembly drag, measure, drawings

User's bandwidth to our nearest AWS region - document loading time

State of caches in user's browser - login page loading time, document loading time

State of document-related caches on our servers - everything, but especially anything that might require us to regenerate all or part of a part studio, such as loading a part studio, assembly, or drawing, just about any document editing, etc. After we deploy a major update, we often have to purge some of our caches, leading to slower loading times the first time you open a document after an upgrade. We're working out ways to avoid purging those caches, but that is still some time away.

Various internal network latencies and database cache states - documents page loading time and search speed, loading time when we don't have data in a document-related cache

Raw speed of our geometry servers - this gets hit most when we don't have a cache for a regenerated part studio, but also affects modeling responsiveness and assembly drag smoothness
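To make the latency item above concrete, here is a toy model (the numbers are invented, not measurements of our system) of why an interactive edit that needs several small round trips is bounded by latency, while a big one-shot transfer is bounded by bandwidth:

```python
def interactive_op_ms(round_trips: int, latency_ms: float) -> float:
    """An edit needing several tiny request/response cycles pays latency each time."""
    return round_trips * latency_ms

def document_load_ms(size_mb: float, bandwidth_mbps: float, latency_ms: float) -> float:
    """A one-shot transfer pays latency once and is then bandwidth-bound."""
    return latency_ms + (size_mb * 8 / bandwidth_mbps) * 1000

# An edit needing 5 round trips at 80 ms latency takes 400 ms no matter
# how fat the pipe is; halving latency halves it:
print(interactive_op_ms(5, 80))  # 400
print(interactive_op_ms(5, 40))  # 200
# A 10 MB document on a 100 Mbps link is mostly transfer time instead:
print(document_load_ms(10, 100, 80))  # 880.0
```

This is why upgrading bandwidth often does nothing for modeling responsiveness while moving closer (in network terms) to an AWS region helps a lot.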

To summarize, there are many variables affecting how long any particular operation takes. As a result, any guarantee of the type "you get X processing power" is meaningless. For example, if I were an Onshape customer, I'd take smarter caches over more raw horsepower any day (as a developer, smarter cache usage is one of the things I and others are working on).

Hopefully the above provides a little more insight into the factors that go into Onshape's performance. Rest assured that we're dead set on improving performance and increasing the size of the documents that Onshape handles comfortably, and that is not going to change in the foreseeable future. Not giving users "enough" juice makes no sense -- it would just go against all our performance efforts.

Ilya Baran \ Director of FeatureScript \ Onshape Inc

Answers

As far as I'm aware, OS runs on leased Amazon servers, not OS-owned hardware, so scaling up is just a matter of paying Amazon more money, not purchasing and maintaining hardware internally. I'd imagine this is dynamically managed, so as load increases more resources are assigned. As long as an accountant doesn't cheap out in the future, we should be fine.

Given how approachable OS staff are, and how seriously they respond to even minor issues, I have a lot of faith in them. Besides, if a bunch of us had problems and vented about it on the forum, it would be bad publicity.

I haven't had an outage in the last 18 months; I'm convinced and happy.

With regard to the slower initial load of a document after an update, it would be great if you could indicate to the user that this is happening. Say, a different spinner icon and a friendly message stating that OS has been improved and the next load will be faster.

I think that would flip the bit in a user's mind from the helpless frustration of watching a spinner to happiness that not only have more toys been released, but also the next load of the doc will be quicker.

We have two pet hates in our design office:

(1) Spinners with no indication of whether they'll stop spinning in 2 seconds, 2 minutes or not at all.

I loved the improvement you guys made to show a simplified render of the model that is refined when fully loaded/rebuilt, but more info would be welcome whilst waiting. Just something to show there is progress taking place and that the browser hasn't hung.

(2) Unexpected reboots, software shutdowns, or nonsensical outputs. These are known locally as "Microsoft Moments" and result in much bad language.

Onshape seems to have been developed to be wonderfully immune to such things.

Thanks to @owen_sparks and @ilya_baran for such detailed replies (within a few hours on a Saturday no less!). A few scattered thoughts:

It makes total sense how the concept of "X processing power" could be meaningless, especially after Ilya's detailed rundown. Still blows my mind that I can get feedback (or trivial Featurescripting help) from an actual Onshape developer so easily through the forum. Thanks for that!

I still think that Onshape is going to keep having to find ways to improve users' expectations and perception of how Onshape is performing. For most users – especially if Onshape gets the widespread use it deserves – Onshape will always be somewhat of a black box, so finding ways to inspire trust is always going to be critical.

Part of the rub here is that coming from Solidworks, there's a mindset of needing to buy a "CAD workhorse" computer to get fast performance. Yes, it's inconvenient, expensive, and obsolesces quickly, but at least I had the feeling of control: the choice of how much power was up to me. I think users feel like they are giving up some control of their CAD fate when they switch to full cloud CAD.

Even though "X processing power" doesn't mean much, it could still be good to see some kind of "performance index". If real performance is the result of a ton of different dynamic things, maybe there could be some way to communicate overall performance succinctly, even at some loss of detail and accuracy.
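To sketch what I mean (every metric name and weight below is invented, purely to illustrate collapsing many factors into one succinct number):

```python
def performance_index(metrics: dict[str, float], weights: dict[str, float]) -> float:
    """Collapse several 0-100 factor scores into one weighted 0-100 score."""
    total = sum(weights.values())
    return sum(metrics[name] * w for name, w in weights.items()) / total

# Hypothetical readings, each already normalized to 0-100:
score = performance_index(
    {"server_cache": 90, "latency": 60, "client_gpu": 80},
    {"server_cache": 0.5, "latency": 0.3, "client_gpu": 0.2},
)
print(round(score, 1))  # 79.0
```

Yes, a single number hides which factor is slow, but as with internet speed tests, a rough headline score might still be reassuring.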

I also like Owen's idea of showing some kind of progress on the spinner. If it goes more than 10-15 seconds, I find myself mousing to the refresh button just to feel like I'm making a difference.

I have a ton of admiration for the Onshape UX/UI team and think implementation has been exceptionally thoughtful so far (like the configurations implementation, amirite?), so I'm sure my suggestion shouldn't be followed exactly. I just hope I can help make the product a bit better by explaining a bit of raw feeling I've had about it, even though it may not be 100% logical. I know it's common to get so good at using a tool that it's hard to remember what it feels like to a layperson. Others probably feel similar things, so it matters.

The Onshape team has definitely already demonstrated a commitment to constant improvement, and I'm really excited to be along for the ride. Really, anything that gives the user feedback about what's happening in The Black Box is helpful, even if it's not perfect data.

Even though "X processing power" doesn't mean much, it could still be good to see some kind of "performance index". If real performance is the result of a ton of different dynamic things, maybe there could be some way to communicate overall performance succinctly, even at some loss of detail and accuracy.

I think this is a great idea! Ideally, we should never need to look at a performance monitor, and generally that's the case for me, but eventually I do hit the limits on some documents, and there I'd love something I could look at to tell me where the problem is and suggest some remedies. I'm thinking of some kind of performance dashboard run similarly to a net speed test (which would, of course, be one part of the dashboard).
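Something in this spirit, maybe (the probe values, thresholds, and remedy text below are all stand-ins I made up; a real dashboard would actually measure them):

```python
# Each entry: (probe, threshold, remedy). Probe values are stubbed for illustration.
PROBES = {
    "latency_ms":     (lambda: 120, 150, "Try a wired connection or a closer network path."),
    "bandwidth_mbps": (lambda: 8,     5, "Close other downloads before opening big documents."),
    "regen_time_ms":  (lambda: 967, 500, "Consider simplifying the heaviest features."),
}

def run_dashboard() -> list[str]:
    """Report each probe and collect remedies for the ones out of range."""
    remedies = []
    for name, (probe, limit, remedy) in PROBES.items():
        value = probe()
        # bandwidth is good when high; the other metrics are good when low
        ok = value >= limit if name == "bandwidth_mbps" else value <= limit
        print(f"{name}: {value} ({'OK' if ok else 'SLOW'})")
        if not ok:
            remedies.append(remedy)
    return remedies

for tip in run_dashboard():
    print("->", tip)
```

The key part for me is the remedy column: not just "it's slow," but which knob I can actually turn.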

One of the ideas we've been tossing around is some sort of diagnostic tool that measures document complexity and informs you of where various parameters (things like number of tabs, parts, part complexity, regen time, assembly size, tessellation complexity, derived chains, configs, etc.) are on the scale of "Onshape should handle this comfortably" to "It's going to slow down to a crawl and a different design approach may be in order".
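A rough sketch of how such a tool might rate each parameter (the parameter names come from the list above; the threshold numbers below are made up for illustration, not Onshape's real limits):

```python
# For each parameter: (comfortable_up_to, crawl_from) -- illustrative numbers only
THRESHOLDS = {
    "tabs": (50, 500),
    "parts": (200, 2000),
    "regen_time_ms": (1000, 30000),
    "assembly_instances": (500, 5000),
}

def rate(param: str, value: float) -> str:
    """Place one document parameter on the comfortable-to-crawl scale."""
    comfortable, crawl = THRESHOLDS[param]
    if value <= comfortable:
        return "Onshape should handle this comfortably"
    if value >= crawl:
        return "A different design approach may be in order"
    return "Expect some slowdown"

print(rate("tabs", 30))    # Onshape should handle this comfortably
print(rate("parts", 900))  # Expect some slowdown
```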

One of the ideas we've been tossing around is some sort of diagnostic tool that measures document complexity and informs you of where various parameters (things like number of tabs, parts, part complexity, regen time, assembly size, tessellation complexity, derived chains, configs, etc.) are on the scale of "Onshape should handle this comfortably" to "It's going to slow down to a crawl and a different design approach may be in order".

Something like this would be pretty cool for me. I'm always experimenting with different workflows.

One of the ideas we've been tossing around is some sort of diagnostic tool that measures document complexity and informs you of where various parameters (things like number of tabs, parts, part complexity, regen time, assembly size, tessellation complexity, derived chains, configs, etc.) are on the scale of "Onshape should handle this comfortably" to "It's going to slow down to a crawl and a different design approach may be in order".

That sounds great. I'd take no offense if OS occasionally told me it was time to take a different approach.

@ilya_baran (or other Onshape people), can you explain how the number of tabs in a document affects performance? I understand why part studio complexity and other things can cause issues, but not the number of tabs. Is it because the tabs are cached in memory, or is there some other behind-the-scenes computation going on? I have noticed a performance difference in larger documents, and I'm just trying to understand things for reference when working on new projects.

Performance is a good discussion to get into more detail about, and I'm glad that Onshape is taking it seriously. I'm starting to see the "Spinning Wheel" too often on a seemingly simple Doc, and I know they're only going to grow in complexity. Also, a colleague who has started to utilize configurations in combination with drawings is seeing a noticeable difference; it's troublesome to work with indeed.

Another example to question is derived parts/sketches. I've noticed, looking at the regen times, that one went from 967 ms to 74 ms when I checked it again later (nothing changed, to my recollection...). Not sure if "cached" data is part of the equation or not?
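That pattern would be consistent with caching: the first regen computes from scratch, and later ones reuse the stored result. A minimal local analogue using memoization (not Onshape's actual mechanism, just the same shape of behavior):

```python
import functools
import time

@functools.lru_cache(maxsize=None)
def regenerate(feature_id: str) -> int:
    """Pretend regen: slow the first time, instant once cached."""
    time.sleep(0.05)  # stand-in for expensive geometry work
    return hash(feature_id)

t0 = time.perf_counter(); regenerate("derived-sketch"); first = time.perf_counter() - t0
t0 = time.perf_counter(); regenerate("derived-sketch"); cached = time.perf_counter() - t0
print(f"first: {first * 1000:.0f} ms, cached: {cached * 1000:.0f} ms")
```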

When my document is slow and I cannot maintain my desired Onshape work speed, I want to know why. I wonder if we could have some sort of dashboard like in this picture to let us know where the problems may be, so we can go about fixing the issue and get our working speed back.

Ilya Baran wrote: "User's latency to our nearest AWS region - responsiveness of most aspects of the system that are not entirely client-side (so not view manipulation), including modeling, assembly drag, measure, drawings".

I suspect this is a big one. I use Comcast Cable for Internet because it's the fastest available in my area, but bandwidth availability is terribly inconsistent. It's so bad that streamed audio is impossible, with constant pauses for buffer refills. If I complain, they go through the motions but nothing changes. For them, bandwidth is about downloading movies; smooth, consistent, real-time access is an alien idea to them.

If the browser seems to momentarily freeze, my best guess is the local internet pipe has temporarily shut down. I've been told cable companies allocate bandwidth with a time-division method, so users get a lot - then nothing.

FWIW, I'm relocating to a small town in another state where symmetrical 600 Mbps fiber to the desktop is available for less than I'm paying Comcast. I'm told it's great. I'll get to see how that works out.