How to monitor your code for performance and error handling in Wix Studio
Learn how to use the Monitoring Dashboard to effectively monitor your backend code on Wix Studio for optimal performance and error handling.
In this video you will learn how to use the Monitoring Dashboard on Wix Studio to effectively monitor your backend code for errors and performance.
Transcript
All right, welcome to another video where we'll be taking a deep dive into the technical functionality of Wix. Today, we're going over backend monitoring, which lets us look at the performance of our site, what users are doing on it, what's working, and what's not. We're going to go over a few different things. First, we'll look at backend requests and data request tabs, and the various charts and tables on them. Then, we'll dive into how we can use these charts to identify and fix common issues on our sites.
So, let's get started! As you can see on my screen right here, I have a site that's not doing so well. That's intentional. I made a bunch of functions on this website that essentially just don't work properly. Some do, but a lot don't. First, let's take a look through our monitoring dashboard, talk about what each element of it is, and then after we're done doing that, we'll go into actually debugging some of the issues that we see here and finding the root causes of that. If we navigate to our monitoring dashboard, which you can find under developer tools > monitoring, you'll see at the top, there is a backend request tab, a data request tab, and a CMS collection storage tab, which is information on how much data is being stored in your collection. But we're not going to be talking about that too much today. Instead, we're going to focus on the backend requests and the data requests. At the top, you'll see that we also have a tool tip that says, "This site may not be working correctly," and it gives some additional resources that you can go and check out.
We can also see the total requests made to our site in the last 30 days and the number of failed requests. So, here we see a lot of successful requests, but we also see quite a few failures, and we get a big red icon here to draw our attention to it. Now, if we scroll down a bit, we get into our backend request details. On the left, we have a chart that shows all backend requests. If we look at backend request duration, this little chart right here, we can already see some very interesting things about issues that might have happened with our site. We have two different lines, a darker blue and a lighter blue. The darker blue line represents the 50th percentile duration (this is the middle of the distribution: when function calls are made, how long does a typical one take?), and the lighter blue line is the 95th percentile (for the function calls at the higher end of total time to execute, how long are they taking?). When we look at the middle of the road, they take about a second. That's expected, that's performing pretty well; we're seeing around 800 to 900 milliseconds along this line. But up here at the top, on the lighter blue line, we're seeing that the function calls that take the longest are all taking 14,000 milliseconds. So this tells us that a lot of our requests are working fine, but there are outlier requests that are not working at all.
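To make the two lines on the duration chart concrete, here is a minimal sketch of how a 50th and a 95th percentile can be computed from a list of call durations. It uses the nearest-rank method, which is an assumption; the video does not say how the dashboard calculates its percentiles. The `percentile` helper and the sample durations are made up for illustration.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
// (Assumption: the dashboard's exact percentile method is not documented here.)
function percentile(durationsMs, p) {
  if (durationsMs.length === 0) return undefined;
  const sorted = [...durationsMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical sample: 18 healthy calls and 2 calls that ran into the 14-second cap.
const durations = [
  800, 820, 830, 840, 850, 850, 860, 870, 880, 890,
  900, 900, 910, 920, 930, 940, 950, 960, 14000, 14000,
];

const p50 = percentile(durations, 50); // typical call
const p95 = percentile(durations, 95); // slow tail

console.log(`p50 = ${p50} ms, p95 = ${p95} ms`);
```

With this sample the median stays under a second while the 95th percentile jumps to 14,000 ms, which is the same shape as the chart in the video: most calls are fine, and a small share of outliers hit the cap.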
14,000 milliseconds is also a very important number because Wix's backend requests are capped at a 14-second duration. So basically, without even looking any further, without looking at the chart below, which we're going to do in a second, we can see that some of our requests are simply taking too long, timing out completely, and being dropped on the backend. Let's move on down then and check out our various functions. So, I've written my functions over here. You can see them. They're all pretty typical things that you might do. These are all HTTP functions, so we're basically exposing an API for our server to talk to our client. Our client is occasionally making requests, and our server is responding to them. The code itself isn't too important for most of the explanation here, but I just want you to be aware that this is how it's structured and that monitoring can be used anywhere: in HTTP functions, in regular backend functions, in web modules. Whatever is going on in the backend, this interface will capture it. And then, if we scroll down, what we can see are our backend functions. Basically, every function that's being called is going to end up listed here, and you'll get the name of each function and where to find it. These are all HTTP functions, for example, and this is what they're called. Then you'll have a number of columns. You'll have the percent of total requests that each function makes up, which is essentially how important it is. Sometimes, you might think you've written a function that's going to get a lot of requests and it gets very few, or vice versa, you might think it's only going to get very few requests and it gets a lot. This just gives you an idea of what's being used on your backend. Then we'll have the number of total requests for the given time frame. Right now, we're looking at the last 30 days. And then we'll show requests that are throttled.
Basically, that means that on Wix's backend, there is a limit on the total number of requests that can be made to your backend in a given minute, and if that gets exceeded, then those requests are going to be throttled and fail. So that's something very good to be aware of. If you ever have a particularly popular Wix site, you might end up seeing that, and then there are steps you can take to mitigate it. There's also the timed out column, which is going to tell you about function calls that ran past the 14-second limit that I mentioned, and the other errors column. These are requests that failed because of a code error, a temporary failure in infrastructure, or other technical issues.
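As a rough illustration of how the columns in this table relate to each other, the sketch below builds the same kind of rows from a list of individual request records. Everything in it is hypothetical: the record shape, the outcome labels, and the function names are invented for the example and are not what Wix stores internally.

```javascript
// Hypothetical request log. The field names and outcome labels are made up.
const requestLog = [
  { fn: "insertCookie", outcome: "ok" },
  { fn: "insertCookie", outcome: "ok" },
  { fn: "insertCookie", outcome: "ok" },
  { fn: "insertCookie", outcome: "throttled" },
  { fn: "insertCookie", outcome: "error" },
  { fn: "getLongRequest", outcome: "timeout" },
  { fn: "getLongRequest", outcome: "timeout" },
  { fn: "getLongRequest", outcome: "timeout" },
];

// Group records by function and count each failure category separately,
// mirroring the throttled / timed out / other errors columns.
function summarize(log) {
  const rows = new Map();
  for (const { fn, outcome } of log) {
    if (!rows.has(fn)) {
      rows.set(fn, { fn, total: 0, throttled: 0, timedOut: 0, otherErrors: 0 });
    }
    const row = rows.get(fn);
    row.total += 1;
    if (outcome === "throttled") row.throttled += 1;
    else if (outcome === "timeout") row.timedOut += 1;
    else if (outcome === "error") row.otherErrors += 1;
  }
  // Add each function's share of all requests and list the busiest first.
  return [...rows.values()]
    .map((row) => ({ ...row, percentOfTotal: (100 * row.total) / log.length }))
    .sort((a, b) => b.total - a.total);
}

const summaryRows = summarize(requestLog);
console.table(summaryRows);
```

Keeping the three failure categories apart matters because they call for different fixes: fewer or batched requests for throttling, smaller units of work for timeouts, and debugging for everything else.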
And lastly, you're going to see the duration of the 95th percentile. That's equivalent to what you have up here in the chart. Essentially, when these functions are called, these are the times at the slow end. So if you have a function that normally completes in a second but sometimes completes in 5 seconds, you're going to see 5,000 milliseconds in the duration for the 95th percentile here. So that's that, and we'll come back to it.
But now let's go take a look at our data request tab. You'll notice that our data request tab is very similar. We have the same tooltip at the top. We get to see the total number of requests made to our actual data collections over the past 30 days and the number that failed. We get our time period, and we get a variety of filtering options. We can filter by a particular collection if we're interested in it, and we can filter by certain operations, so if inserts or aggregations are causing you trouble, you can see that. We can also filter by error code. If you're encountering a large number of errors with a specific error code, you can filter down and see just those errors and what they're affecting. Then at the top, we get a chart that shows all of our data requests: the numbers that were successful, had an error, or timed out. Then we have a very useful chart on the left here, which is essentially all of the different error types. These are error codes that, when you click on them, will take you to the Wix Data error codes page, and you can scroll through this page, look up your error, and get a really good idea of what is causing the error so that you can go and solve it in your code. So now we'll head back over to our monitoring tab, and we'll notice that here, too, we get a data request duration, so we can see the divergence between our happy, well-performing data requests and some less well-performing data requests.
Everything's performing pretty well here on the total time for data requests, though. And then, much like on the other tab, we have our top data requests, except these are broken out by collection. So you can see the collection name and ID, you can see the operation that was performed, the percent of requests (again, this gives you a great idea of what's actually happening on your backend), the total number, and then throttling, timeouts, other errors, and duration, very similar to what you saw on the other tab.
And so now let's take a deeper look at our backend request tab, and we'll be diving into specific functions and how they're working. If we look at our list here, we see we have get popular request and insert cookie, which are both performing pretty well. There are a lot of requests and relatively few errors. Errors are almost impossible to avoid entirely, so it's always good to have good error recovery in your functions. That's beyond the scope of this video, but perhaps it's something we can talk about in a future video. If we go ahead, we can look at our get popular request. It's a very simple request: it's basically just getting a request and returning true. So it doesn't really do much; it's just here for a nice little demonstration. Then we have what could be considered, for this website, the most important function on the site, and it is also performing well when we look at our backend requests. This function just stores the names of cookies, takes a quantity, and says whether or not they are gluten-free. That's all we're really doing here, and we're writing it to our cookies database. So that function, which does very little, is running quite well, and that's what you would expect. However, we do see that we have some issues. Here, for example, get total cookies has 315 requests and 314 errors. Almost every single time this function is called, it seems to fail.
So that's great to know. You know, sometimes things can fail silently. You may have client code, for example, that calls a function, or in this case a remote endpoint, and the client doesn't do a good job of actually tracking whether or not an error has happened. That's a pretty common thing to encounter, so it's great that we have this here, because if something ends up going wrong, we can always take a dive into what's gone wrong. And so if we head over to our get total cookies, let's check it out. We can see that, yeah, we've written it, and in the response body, there's a little hint here: result.item is not a property on this result. It's actually items.
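The property mix-up described above can be reproduced with a plain object standing in for a query result. The video does not show the full body of the get total cookies function, so the two functions below are hypothetical reconstructions; the only detail taken from the source is that the rows live under `items` rather than `item`.

```javascript
// Stand-in for a data query result: the returned rows are under `items`.
const queryResult = {
  items: [
    { name: "Chocolate chip", quantity: 12 },
    { name: "Oatmeal", quantity: 8 },
  ],
};

// Hypothetical buggy version: `item` is undefined on the result,
// so reading `.length` from it throws a TypeError on every call.
function totalCookiesBuggy(result) {
  return result.item.length;
}

// Hypothetical fixed version: read the rows from `items`.
function totalCookies(result) {
  return result.items.length;
}

let buggyError = null;
try {
  totalCookiesBuggy(queryResult);
} catch (err) {
  buggyError = err;
}

console.log(buggyError instanceof TypeError); // true
console.log(totalCookies(queryResult)); // 2
```

Because the exception is thrown on every call, a typo like this would be consistent with the near-100% failure rate the dashboard reports for the function.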
And so that's probably why it was failing, so we can just go ahead and fix that right there. And then, of course, we would publish it, but we're not going to do that right now.
Next, we see that we have this other function, get long request. I've labeled these pretty obviously as to what they do, and we see right here that get long request is a function that is expected to always take about 30 seconds (30,000 milliseconds). Like I mentioned, that's beyond the 14-second time limit for backend requests, and so get long request is always going to fail here. And that's what we're seeing in our backend. So one thing that we might do, if this was a real function, is split it up. We might see what parts of the function we can take and split into maybe two functions or three, and do it in a way that no particular part of the computation takes more than 14 seconds. This is also pretty rare; in general, you shouldn't be doing stuff that takes too long to execute on the backend. If you are, that's potentially also a hint that your code is doing too much in a single function. Things should really be isolated so that each function does only the one thing it's supposed to do, and everything else that might need to be done in a workload can be written into other functions. That's a general thing to keep in mind whenever you're programming: these things should be kept separate, and that will also help you avoid this type of timeout.
And so if we come back over to our backend functions, we see that we also have list cookies here. That's performing pretty well, though it is taking a little while to list out all the cookies. If we head over to our code for list cookies, we see that we're essentially querying the collection, taking the top five results, and then returning that response. So we're returning five cookies from our list of cookies.
This right here, maybe it's something we can solve in code, but maybe it's something that we can solve in other ways. For example, if we were to add an index to the cookies collection, we might see a performance gain.
Moving down the list, we see an interesting one here called insert cookie no auth. And if we head over to insert cookie no auth, we see something very interesting: we are inserting cookies just like we did in our insert cookie function, but the no auth one lacks this suppressAuth declaration. And that's interesting because, given the permissions of our cookies collection, we need to be an administrator in order to add to it. Here we're not an administrator and we're trying to add to it, so we are getting an error, but we're not seeing the error in our backend monitoring. And that's because we've done a try and catch. This is actually good practice: we're returning the result to our client saying that's a bad request. But that's not accurate. This is not a bad request from the client; the client's request might very well be properly formed. In fact, I know for a fact that it is. Instead, what's happening is that our insert doesn't have administrator privileges, but that's not exposed here, so we don't see it. All we see on our backend request details is 193 requests and one error. Everything looks like it's pretty much working fine, but we can dive into that a little deeper. So now if we look at our data requests and we come on down to look at, let's see, where is it, inserts... Oh, that's interesting. We have an insert operation with a lot of requests; a lot of them succeed, but there are about 190 errors here too. That gives us a bit of a hint: we're getting about the same number of errors on this page as we are successful requests on the other one, and I wonder what those errors could be.
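The masking effect described above comes from answering every caught error with a bad request. One possible way to keep the two cases apart is sketched below: validate the payload first and reserve the 400 response for that, then log and report a server-side failure if the insert itself throws. This is a sketch under stated assumptions, not the code from the video. The `ok`, `badRequest`, and `serverError` helpers are simplified stand-ins defined here so the example runs outside Wix, `fakeData` is an invented test double, and the `data` parameter is a hypothetical injected object with an `insert(collection, item, options)` method. The `suppressAuth` option name is the one mentioned in the video.

```javascript
// Simplified stand-ins so the sketch runs outside Wix. These are NOT Wix APIs.
const ok = (body) => ({ status: 200, body });
const badRequest = (body) => ({ status: 400, body });
const serverError = (body) => ({ status: 500, body });

// `data` is a hypothetical injected object exposing insert(collection, item, options).
async function insertCookie(data, cookie, log = console.error) {
  // Client-side problem: the payload itself is malformed.
  if (!cookie || typeof cookie.name !== "string" || !Number.isInteger(cookie.quantity)) {
    return badRequest({ error: "name (string) and quantity (integer) are required" });
  }
  try {
    // Run the insert with elevated permissions, as the working function in the video does.
    const saved = await data.insert("cookies", cookie, { suppressAuth: true });
    return ok(saved);
  } catch (err) {
    // Server-side problem: log it and say so, instead of blaming the client.
    log("insert into cookies failed:", err);
    return serverError({ error: "could not save cookie" });
  }
}

// Invented test double: rejects the insert unless suppressAuth is set.
const fakeData = {
  async insert(collection, item, options = {}) {
    if (!options.suppressAuth) throw new Error("permission denied");
    return { _id: "1", ...item };
  },
};

insertCookie(fakeData, { name: "Oatmeal", quantity: 3, glutenFree: false })
  .then((response) => console.log(response.status));
```

The design choice here is only about honesty of the response: a well-formed request that fails for a permissions reason is reported as a server problem and leaves a log entry, so it is less likely to go unnoticed.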
So we see here we have a few: WDE0027 and WDE0025. Those seem interesting to me, so let's go check those out on the error codes page. Let's look at 25. 25 says that your collection does not exist: you cannot work with a collection using the Data API before it's created in the editor. Okay, that makes sense, but if I come over here, well, my collection is called cookies, and if I expand my code panel and look at my collections, I have a collection called cookies there. It's typed correctly, everything is right there, so that's not the error. So if we come back here, we also see that there's an error 27. If we look at this one, it says the current user does not have permissions to perform this action on your collection, and there it is, that's the clue, that's the hint. We can go in and say, "Oh, look at that, I need to suppress auth here."
Now, if we look at our next function, we're going to see something similar to what we saw with insert cookie no auth. Here, we have 180-some requests with only one error, but that's a little weird, because if we go over to our invalid insert, we see that we're actually trying to insert into a non-existent collection. So we come back to data requests, and you're probably recognizing this from the other error code that we looked up. In fact, let's just skip ahead to it.
The data collection does not exist. And of course, I can use that little bit of knowledge, head over here, open up my code panel, check out my database, and see that that collection does not exist, because it's not here in my collections.
And then let's take a look at our last function, get product list, and we can see that there's nothing really going on here. Seven requests, they're all succeeding, and it's also showing that this function is used very, very little. That's also something to keep in mind when it comes to maintaining your code base. You know, if a function is almost never used, should you be spending time maintaining it? Is it the right feature or function? These are things to think about, because monitoring isn't just about your performance; it also gives you deeper introspection into your code, how it works, and how it's being used. So these are just things to consider. Perhaps we don't need that function.
All right, so that's all there is to it. If you haven't looked at your monitoring dashboard yet, I'd recommend that you go check it out. You never know what you'll find there. And remember, three of the most common issues that you're going to encounter are throttling, timeouts, and errors. For throttling, if your code is getting throttled, try to reduce the number of requests needed to run your website, and batch those requests if possible. For timeouts, do the opposite: split up the work into multiple calls, and make sure that each endpoint has just one concern and isn't doing things that are better done in a separate endpoint. And for errors, you'll need to debug your code using site events, Google Operations, and the functional testing interface. We'll talk more about some of those in the next video on backend code issues, when we cover logging. So stay tuned for that, and thanks for watching.
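The advice on timeouts, splitting the work into multiple calls, can be sketched as a function that works through a list until a time budget is spent and then hands back a cursor for the next call. This is one possible approach, not something shown in the video. The `processChunk` name, the 10-second budget, and the fake clock are all assumptions for illustration; the budget is simply chosen to leave headroom under the 14-second cap when a single item takes a few seconds.

```javascript
// Process items from `start` until the time budget is used up, then return a
// cursor so the caller can continue in a follow-up request.
// `now` is injectable so the example can use a fake clock.
function processChunk(items, start, budgetMs, handle, now = Date.now) {
  const deadline = now() + budgetMs;
  let index = start;
  while (index < items.length && now() < deadline) {
    handle(items[index]);
    index += 1;
  }
  // null means everything has been processed.
  return { next: index < items.length ? index : null };
}

// Fake clock: each item "takes" 3 seconds, so ten items would need 30 seconds
// in a single call, which is over the limit.
let clock = 0;
const fakeNow = () => clock;
const processed = [];
const handleItem = (item) => {
  processed.push(item);
  clock += 3000;
};

const workItems = Array.from({ length: 10 }, (_, i) => i);
let cursor = 0;
let calls = 0;
while (cursor !== null) {
  // Each iteration stands in for one separate backend request.
  cursor = processChunk(workItems, cursor, 10000, handleItem, fakeNow).next;
  calls += 1;
}

console.log(`processed ${processed.length} items in ${calls} calls`);
```

In a real endpoint the per-item work would usually be asynchronous and the cursor would travel back to the client in the response, but the idea is the same: no single call does more than it can finish inside the limit.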