Pandas loc() and iloc() functions
A Python library is a collection of related modules, that is, files of reusable code such as functions and classes. Python has numerous libraries, and one of my favorites, and one of the most powerful, is Pandas.
I had a use case at work where I had to sum multiple columns into a new column called “Total”. Here is the sample data set, which has a category and the scores given by consumers from 5 locations.
loc and iloc Functions
- loc and iloc are used to slice a data frame: loc selects rows and columns by label or by a boolean condition, and iloc selects them by integer position.
Let’s say we want to filter the above data set down to the rows where the column “Category” is “A”. You can use loc with a boolean condition to select those rows.
import pandas as pd

# Example using loc
df.loc[df['Category'] == "A"]
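Here is a minimal, self-contained sketch of the loc filter above. The original data set is not reproduced in the text, so the small frame below is made up for illustration; only the column names Category, Trail 1 and Trail 2 follow the article.

```python
import pandas as pd

# Hypothetical sample data: the values are invented for illustration.
df = pd.DataFrame({
    "Category": ["A", "B", "A", "C"],
    "Trail 1": [3, 5, 4, 2],
    "Trail 2": [4, 1, 5, 3],
})

# Boolean condition: keep only the rows where Category equals "A".
only_a = df.loc[df["Category"] == "A"]

# loc also accepts a column label after the comma,
# so rows and columns can be narrowed in one step.
only_a_trail1 = df.loc[df["Category"] == "A", "Trail 1"]

print(only_a)
print(only_a_trail1.tolist())
```

Note that the filtered rows keep their original index labels, which is usually what you want when you later join the result back to the full frame.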
For the same data set, if we want the first few rows, we use iloc, which is purely integer-location based indexing for selection by position. It is written as df.iloc[R, C], where R selects the rows and C selects the columns.
# Example using iloc
print(df.iloc[0:4])
The same result could be achieved with df.head(4), but iloc gives finer control because you can pick any range of rows and columns by position.
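To make the comparison with head concrete, here is a short sketch on an invented frame (the values are assumptions, not the article's data). It shows that iloc[0:4] and head(4) return the same rows, and then two selections that head cannot express.

```python
import pandas as pd

# Hypothetical sample data: the values are invented for illustration.
df = pd.DataFrame({
    "Category": ["A", "B", "A", "C", "B"],
    "Trail 1": [3, 5, 4, 2, 1],
    "Trail 2": [4, 1, 5, 3, 2],
})

# Rows at positions 0, 1, 2 and 3 (the stop value 4 is exclusive).
first_four = df.iloc[0:4]
same_as_head = df.head(4)

# Rows at positions 1 and 2, columns from position 1 onward.
block = df.iloc[1:3, 1:]

# Every second row, which head() cannot do.
every_other = df.iloc[::2]

print(first_four.equals(same_as_head))
print(block)
print(every_other)
```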
Coming back to the initial use case of calculating the totals, I could do this by using the column names and summing all of them together. Something like this.
# Parentheses let the expression continue onto the next line
df['Total'] = (df['Trail 1'] + df['Trail 2'] + df['Trail 3']
               + df['Trail 4'] + df['Trail 5'])
But now that we know how to use loc and iloc, we can modify the code and save some time: there is no need to strip trailing spaces from the column headers or to type each column header manually.
df['Total'] = df.iloc[:, 2:7].sum(axis=1)
In the above line, you are asking pandas to take all the rows (:) and the columns at positions 2 through 6 (the stop value 7 is exclusive), sum the values across each row (axis=1), and store the result in a new column, Total.
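The row totals can be sketched end to end as follows. The frame is invented for illustration, and the leading "Item" column is an assumption added only so that the five score columns land at positions 2 through 6, matching the 2:7 slice. The sketch also shows two variations that avoid hard-coding positions: looking the positions up with columns.get_loc, and a label slice with loc (where both endpoints are inclusive).

```python
import pandas as pd

# Hypothetical frame: "Item" is an assumed extra column so that
# "Trail 1".."Trail 5" sit at integer positions 2..6.
df = pd.DataFrame({
    "Item": ["x", "y", "z"],
    "Category": ["A", "B", "A"],
    "Trail 1": [1, 2, 3],
    "Trail 2": [4, 5, 6],
    "Trail 3": [7, 8, 9],
    "Trail 4": [1, 1, 1],
    "Trail 5": [2, 2, 2],
})

# Positional slice: all rows, columns 2..6, summed across each row.
total_by_position = df.iloc[:, 2:7].sum(axis=1)

# Same slice, but with the positions looked up from the column labels.
start = df.columns.get_loc("Trail 1")
stop = df.columns.get_loc("Trail 5") + 1  # +1 because the stop is exclusive
total_by_lookup = df.iloc[:, start:stop].sum(axis=1)

# Label slice with loc: unlike iloc, both endpoints are included.
total_by_label = df.loc[:, "Trail 1":"Trail 5"].sum(axis=1)

df["Total"] = total_by_position
print(df)
```

The positional version is the shortest to type, but it silently sums the wrong columns if the layout changes, so the lookup or label variants may be safer for data whose column order is not fixed.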