Similar to what we did with the bar plot, sometimes we want to add a third variable to our scatter plot. So I'm going to copy this down. If that third variable is a category, we need to come into our aes(), say color equals, and give it whatever variable we want. In this case, it's the type of drill used to create the well. You'll notice that we're using color, whereas before we used fill. A good rule of thumb is that if it's a point or a line, you want to use color, but if it's a polygon, you want to use fill. So we'll use fill with bar charts or histograms, but we'll use color with line plots or scatter plots.
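As a rough sketch of what that looks like in code: the data frame below is made up for illustration, and the names wells, total_gas, max_gas, and drill_type are assumptions, not the names used in the video. The only part that matters is color inside aes().

```r
library(ggplot2)

# Hypothetical stand-in data; the column names are assumptions.
wells <- data.frame(
  total_gas  = c(120, 340, 560, 210, 480, 900),
  max_gas    = c(15, 42, 61, 25, 55, 98),
  drill_type = c("H", "V", "D", "H", "V", "D")
)

# Points take color (not fill), so the categorical variable goes in color.
p_color <- ggplot(wells, aes(x = total_gas, y = max_gas, color = drill_type)) +
  geom_point()

print(p_color)
```

Because drill_type is categorical, each distinct value gets its own color and its own legend entry.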
If we run this, because it's a categorical variable, we get an automatically generated legend that tells us which color each drill type is associated with. You'll notice that we can't really see all of the points, and that's because the plotting tool draws dots on top of each other. So we're actually hiding some of the dots with the H drill type. If we want to make those a little easier to see, we can use an argument called alpha, which sets the opacity of the dots.
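A minimal sketch of the alpha argument, again on made-up data with assumed column names. The value 0.5 is just an example; the video does not say which value was used.

```r
library(ggplot2)

# Hypothetical stand-in data with deliberately overlapping points.
wells <- data.frame(
  total_gas  = c(120, 120, 340, 340, 560, 560),
  max_gas    = c(15, 15, 42, 42, 61, 61),
  drill_type = c("H", "V", "H", "D", "H", "V")
)

# alpha is set outside aes() because it is a fixed value, not a variable.
# 1 is fully opaque, 0 is fully transparent.
p_alpha <- ggplot(wells, aes(x = total_gas, y = max_gas, color = drill_type)) +
  geom_point(alpha = 0.5)

print(p_alpha)
```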
Now we can start to see a little bit more of those purple dots here. We've made the dots more see-through, and that's why some of this purple is coming through. But again, it's not necessarily the most helpful way to see the differences between drill types. Another way to show these differences is to add multiple panels, one for each drill type. So again, I've just copied and pasted our previous plot down, and we're just going to keep adding to it. We add a plus sign and then another function, in this case one called facet_wrap(). facet_wrap() lets us specify a variable that will be used to create the panels. So, within quotes, we write a tilde followed by the variable name. Then we add scales equals "free", which lets each panel fit its axes to its own data instead of forcing every panel onto the same scale.
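Here is one way the faceting step might look, using made-up data and assumed column names. This sketch writes the facet variable as an unquoted formula (a tilde and the variable name), which is a commonly documented form; the video describes putting it in quotes.

```r
library(ggplot2)

# Hypothetical stand-in data; the column names are assumptions.
wells <- data.frame(
  total_gas  = c(120, 340, 560, 210, 480, 900),
  max_gas    = c(15, 42, 61, 25, 55, 98),
  drill_type = c("H", "V", "D", "H", "V", "D")
)

# One panel per drill type; scales = "free" gives each panel its own
# x and y axis ranges.
p_facet <- ggplot(wells, aes(x = total_gas, y = max_gas, color = drill_type)) +
  geom_point(alpha = 0.5) +
  facet_wrap(~ drill_type, scales = "free")

print(p_facet)
```

If you want the panels to stay directly comparable, you can leave scales out, and every panel will share the same axes.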
If we run this, we can see that it's created a separate panel for each of our drill types and still kept the color, which helps. We can even see how different the 95 percent confidence intervals are for each drill type. The downside is that our axis labels are running into each other, and the axes don't necessarily match from panel to panel. If you're interested, you can look into the documentation and figure out how to fix these, but as a demonstration, this is how you can show these different panels. So, for our final plot, we're going to come back up here and take our original basic scatter plot.
Until now, the third variable we've been using has been a qualitative variable. But if you want to add a quantitative variable, the process is still the same. You still specify color within aes(), and then you give it whatever variable you want. Here, I'm going to use this perf interval length, which is a quantitative variable that they calculate in hydraulic fracturing.
So we can color it based on that and go ahead and run it. You'll notice that the main difference here is that we now get a continuous legend instead of individual categories. We can see where the majority of the low-length points are and where some of the high ones are. You'll also notice there are some gray points in here. This means that these particular data points had values for max gas and total gas, but they had NA values for the perf interval length. It still plots those; it just doesn't give them a color from the scale. So you can still see all of the data, colored wherever a value is available. And so, this is how you can add a third variable to your scatter plots, both as a continuous variable and, as we showed earlier, as individual categories.
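To make the continuous case concrete, here is a small sketch on made-up data. The column name perf_interval_length is an assumption based on how the variable is described, and one value is set to NA on purpose to show the gray point.

```r
library(ggplot2)

# Hypothetical stand-in data; one perf_interval_length value is missing.
wells <- data.frame(
  total_gas            = c(120, 340, 560, 210, 480, 900),
  max_gas              = c(15, 42, 61, 25, 55, 98),
  perf_interval_length = c(1500, 3200, NA, 2100, 4800, 6000)
)

# A numeric variable in color produces a continuous color bar legend.
p_cont <- ggplot(wells, aes(x = total_gas, y = max_gas,
                            color = perf_interval_length)) +
  geom_point()

print(p_cont)

# Inspect the colors ggplot2 computed for each point; the row with the
# missing value falls back to the scale's gray NA color.
layer_data(p_cont, 1)$colour
```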