R Operators: Arithmetic, Relational, Logical
Course Description
R operators are not only useful for doing calculations on data but can also be used to compare values or set up conditions for values.
What You'll Learn
> The 3 main types of operators: arithmetic, relational, and logical
If you haven’t installed R and RStudio already, you can watch the "Getting started with Python and R for Data Science" video to get started.
For the dataset used in this exercise, download from here.
If you want to do a quick calculation on some numeric values, such as calculating the difference between values, or compare values to see if they match or meet a certain condition, then you’ll need to know the different operators you can work with. We’ll focus on three main types of operators: arithmetic, relational, and logical.
Let’s first look at arithmetic - you have your typical addition, subtraction, multiplication, division, remainder, and exponent. What’s also important to know with these arithmetic operators is their order of operations. So, when calculating some values, R first calculates anything inside parentheses, followed by anything that has an exponent, then multiplication and division (which rank equally and are worked out left to right), and finally addition and subtraction (which also rank equally and are worked out left to right). This is important when, for example, you’re calculating the mean of some numbers. If you sum the numbers and divide by the number of numbers without parentheses, only the last number gets divided, so you’ll see a different result than if you use parentheses to sum the numbers first and then divide.
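As a rough sketch of the arithmetic operators and the order-of-operations point, here is a short R example. The variable names and the numbers (7, 2, and the values 2, 4, 6) are made up for illustration and do not come from the video.

```r
# Hypothetical values, chosen only for illustration.
a <- 7
b <- 2

a + b    # addition
a - b    # subtraction
a * b    # multiplication
a / b    # division
a %% b   # remainder (modulus)
a ^ b    # exponent

# Mean of the made-up numbers 2, 4, and 6.
without_parens <- 2 + 4 + 6 / 3     # only the 6 is divided by 3, giving 8
with_parens    <- (2 + 4 + 6) / 3   # the whole sum is divided by 3, giving 4

print(without_parens)
print(with_parens)
print(mean(c(2, 4, 6)))             # built-in mean agrees with with_parens
```

Only the version with parentheses matches what the built-in mean() function returns.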
As the default order is that division comes before addition, we use parentheses to tell the program to first calculate the additions and then move on to the division. This gives us the correctly calculated mean.
So, relational and logical operators allow us to compare data values to see if they match, don’t match, or are above, below, or equal to numeric thresholds, or to extract data that meet a number of these conditions. Your relational operators include checking if a numeric value is greater than, less than, greater than or equal to, or less than or equal to a threshold, and checking if a numeric value or a character string is equal to (matches) another value or is not equal to a value.
Your logical operators include “and”, “or”, and “not”. You use “and” when you want to extract data that meets both one condition and the other condition, or more. For example, it has to be both greater than this number and equal to this category. And “or” means that data will be extracted if it meets at least one of the conditions or options that apply. For example, it can either be greater than this value or belong to this category. If it is greater than this value, then it will extract the data and will have no need to check any other condition, as it has already satisfied at least one. If it doesn’t meet the first condition, it will check the next condition, and the next condition after that, and so on until it meets at least one of the given conditions.
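To make the relational operators and the “and” and “or” operators concrete, here is a minimal sketch on two small vectors. The incomes, job titles, and thresholds are invented for illustration and are not taken from the course dataset.

```r
# Made-up incomes and job titles, for illustration only.
income <- c(95000, 110000, 120000)
title  <- c("Data Analyst", "Product Manager", "Product Manager")

income > 100000               # greater than
income < 100000               # less than
income >= 110000              # greater than or equal to
income <= 110000              # less than or equal to
title == "Product Manager"    # equal to (matches)
title != "Product Manager"    # not equal to

# "and": both conditions must be TRUE for an element to be kept.
both <- income > 115000 & title == "Product Manager"

# "or": at least one condition must be TRUE for an element to be kept.
either <- income > 115000 | title == "Data Analyst"

print(income[both])
print(income[either])
```

One detail worth knowing: the element-wise & and | used here evaluate both sides for every element. The stop-as-soon-as-one-condition-decides behavior belongs to the single-value forms && and ||, which are typically used in if statements.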
The "not" logical operator basically extracts everything that does not meet one or more of the conditions. So, for example, I want to get everything that does not belong to this category. I’m interested in everything except for those things that are in this category. I’ll give you an example. We have some data here, which I read into R, and we’ll cover reading data into R in another video dedicated to this, but we’re just using this to demonstrate operators. So, this data set looks at the average income across major U.S. cities for different job roles. As a product manager, I want to know if San Francisco pays more on average for my job role than New York City, where I’m currently living, so I’ll show you how to use some operators to extract these data.
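Going back to the "not" operator for a moment, here is a small sketch of how it can exclude a category. The job titles are made up, and the names not_pm, excluded, and not_in_set are hypothetical.

```r
# Made-up job titles, for illustration only.
title <- c("Data Analyst", "Product Manager", "Data Scientist", "Product Manager")

# Everything that is NOT a product manager.
not_pm <- title[!(title == "Product Manager")]

# Everything that is NOT in a set of categories.
excluded   <- c("Product Manager", "Data Analyst")
not_in_set <- title[!(title %in% excluded)]

print(not_pm)
print(not_in_set)
```

For a single category, !(title == "Product Manager") gives the same result as title != "Product Manager".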
So, we’ll first extract the New York City average income for product managers and store this in a variable called “nyc.product.managers”. We’re going to use our income data set here, and inside this, we want to look for our city variable and have this equal to, or match, “New York City”. We’re going to use an “and” condition as well, because we also want the job title to match product manager, so we get people who live in New York City and are product managers. For the job title variable, we would like this to equal “product manager”. Okay, I’ll print this here. Awesome.
So, what we’re interested in here is the value under the average income variable. Now, we also need to get the same for San Francisco, so we can compare them. We’ll just call this “sf.product.managers”, again using our income data set. We’re interested in our “city” variable and we would like it to equal “San Francisco”, and we would also like the “job title” variable to equal “product manager”. Okay, cool.
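The two extractions could look roughly like the sketch below. The real dataset is not reproduced here, so the data frame is a tiny made-up stand-in: the column names city, job.title, and avg.income and every value in it are assumptions, and only the variable names nyc.product.managers and sf.product.managers come from the video.

```r
# Tiny made-up stand-in for the income data set (all values hypothetical).
income <- data.frame(
  city       = c("New York City", "New York City", "San Francisco", "San Francisco"),
  job.title  = c("Product Manager", "Data Analyst", "Product Manager", "Data Analyst"),
  avg.income = c(110000, 80000, 117000, 85000),
  stringsAsFactors = FALSE
)

# Rows where city matches New York City AND job title matches Product Manager.
nyc.product.managers <- income[income$city == "New York City" &
                                 income$job.title == "Product Manager", ]

# The same two conditions for San Francisco.
sf.product.managers <- income[income$city == "San Francisco" &
                                income$job.title == "Product Manager", ]

print(nyc.product.managers)
print(sf.product.managers)
```

The comma before the closing bracket matters: the conditions select rows, and leaving the column position empty keeps all columns.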
So, we have the average income for product managers in New York City and San Francisco. First, I want to know if it’s true that product managers living in San Francisco have a higher income on average than those in New York City. What I’m going to ask is: is San Francisco product managers' average income greater than NYC product managers' average income? Okay, the result says this is true. So, San Francisco product managers are paid more on average, and I might consider relocating to this city, but I’ll go a step further than that.
I want to know how much more San Francisco folks are paid on average as a dollar figure, so I’m looking at the difference: San Francisco product managers' average income minus New York City product managers' average income. So, the difference is 7,000, which might or might not be big enough to make the relocation worth it, but it’s not bad either. Now you know how to use operators to extract useful data.
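The comparison and the subtraction can be sketched with two single values. The individual incomes below are placeholders, since the transcript only states that the difference is 7,000, and the variable names are hypothetical.

```r
# Placeholder average incomes; only the 7,000 gap is stated in the video.
sf.avg.income  <- 117000
nyc.avg.income <- 110000

# Relational operator: is the San Francisco average greater than the NYC average?
sf.pays.more <- sf.avg.income > nyc.avg.income

# Arithmetic operator: how much greater is it, in dollars?
income.gap <- sf.avg.income - nyc.avg.income

print(sf.pays.more)
print(income.gap)
```

With a data frame, the same two lines would use the average income column of each extracted row in place of the placeholder values.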
Next, we’ll cover how to read data into R.
Rebecca Merrett - Rebecca holds a bachelor’s degree of information and media from the University of Technology Sydney and a post graduate diploma in mathematics and statistics from the University of Southern Queensland. She has a background in technical writing for games dev and has written for tech publications.
© Copyright – Data Science Dojo