How to Select a Row of Several Columns with loc from a DataFrame using the Pandas Library in Jupyter Notebook


Introduction

This article shows one way to select a single row, restricted to several columns, from a pandas DataFrame. The DataFrame provides loc for this purpose. Strictly speaking, loc is a label-based indexer used with square brackets rather than a function called with parentheses. A DataFrame stores tabular data, and that data can come from many kinds of sources: a CSV file, an Excel file, a database connection, and so on. The steps are simple. In this article, they run in an open source web-based application called Jupyter Notebook. So, the first step is to start Jupyter Notebook as follows:

(myenv) C:\python\data-science>jupyter notebook
[I 17:08:28.062 NotebookApp] Serving notebooks from local directory: C:\python\data-science
[I 17:08:28.062 NotebookApp] The Jupyter Notebook is running at:
[I 17:08:28.067 NotebookApp] http://localhost:8888/?token=4dd9801ef2aacad1d445955b0ae4621b4c669da84c617e7b
[I 17:08:28.068 NotebookApp]  or http://127.0.0.1:8888/?token=4dd9801ef2aacad1d445955b0ae4621b4c669da84c617e7b
[I 17:08:28.070 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 17:08:28.225 NotebookApp]
    To access the notebook, open this file in a browser:
        file:///C:/Users/Personal/AppData/Roaming/jupyter/runtime/nbserver-8796-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=4dd9801ef2aacad1d445955b0ae4621b4c669da84c617e7b
     or http://127.0.0.1:8888/?token=4dd9801ef2aacad1d445955b0ae4621b4c669da84c617e7b

Select a Row of Several Columns with loc from a DataFrame

After starting the application, type the script in a notebook cell. The first script imports the data. Samples are in the articles How to Read CSV File into a DataFrame using Pandas Library in Jupyter Notebook and How to Get Data From a PostgreSQL Database in Jupyter Notebook. They are available on a page called Data Science. In general, the script looks as follows:

import pandas as pd
# index_col makes the named column the row labels that loc looks up
df = pd.read_csv("file_csv.csv", index_col="label_name")

With the data loaded by the script above, rows can be selected with several columns. But before that, list all the available columns as follows:

[Image: notebook output listing the available columns of the DataFrame]
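As a rough sketch of the loading and column-listing steps, the snippet below reads a tiny in-memory CSV instead of a real file. The team names and all numeric values are invented placeholders for illustration, not real player data. Only the player names and the 'Name', 'Number', 'Position' and 'Height' column labels come from this article, and the 'Team' column is an assumption.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file; every value here is a made-up placeholder.
csv_text = """Name,Team,Number,Position,Height
Avery Bradley,Team A,1,PG,6-2
Raul Neto,Team B,2,PG,6-1
"""

# index_col="Name" turns the Name column into the row labels used by loc.
df = pd.read_csv(io.StringIO(csv_text), index_col="Name")

# The index column is no longer among the regular columns.
columns = df.columns.tolist()
print(columns)  # ['Team', 'Number', 'Position', 'Height']

# info() prints the column names together with their dtypes and non-null counts.
df.info()
```

With a real file, the io.StringIO object would simply be replaced by the path to the CSV file.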

After loading the data into the DataFrame as above, selection is possible. The following example selects a row with all of its columns. Selecting the row works by passing loc a value from the column named in index_col. Here that column is ‘Name’, which is one of the columns in the CSV file.

[Image: notebook output of selecting the row ‘Raul Neto’ with all of its columns]
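A minimal sketch of selecting one row with all of its columns might look like the following. It assumes a small in-memory CSV with invented placeholder values, and the 'Team' column is hypothetical. Passing a single index label to loc returns that row as a Series whose index holds the column names.

```python
import io

import pandas as pd

# Hypothetical sample data; the values are placeholders, not real statistics.
csv_text = """Name,Team,Number,Position,Height
Avery Bradley,Team A,1,PG,6-2
Raul Neto,Team B,2,PG,6-1
"""
df = pd.read_csv(io.StringIO(csv_text), index_col="Name")

# A single row label with no column argument selects every column of that row.
row = df.loc["Raul Neto"]
print(row)

# The result is a Series: its name is the row label, its index the column names.
print(row.name)
print(row.index.tolist())
```

If the label does not exist in the index, loc raises a KeyError, so checking the spelling of the name is usually the first thing to try when a lookup fails.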

The example above uses a CSV file containing data on several NBA players. Before displaying several columns of a row, the available columns can be checked by typing ‘df.info()’. In the output image above, however, looking up the row by the value ‘Raul Neto’ already shows the available columns. So, the image above displays how to select a row with all of its columns. To show only several columns, pass the column labels as follows:

[Image: notebook output of selecting several columns of one row with loc]

So, the script above follows this pattern:

df.loc['Avery Bradley',['Number','Position','Height']]

The script above selects only the row whose index value is ‘Avery Bradley’. The row labels were defined earlier through index_col, which points at the ‘Name’ column. The second argument to loc is a list of column names. Those column names are ‘Number’, ‘Position’ and ‘Height’.
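The pattern can be tried end to end with the sketch below. It again assumes a small in-memory CSV with invented placeholder values and a hypothetical 'Team' column. The last statement is a small variation that is not part of the original script: passing a list of row labels as well returns a DataFrame instead of a Series.

```python
import io

import pandas as pd

# Hypothetical sample data; the values are placeholders, not real statistics.
csv_text = """Name,Team,Number,Position,Height
Avery Bradley,Team A,1,PG,6-2
Raul Neto,Team B,2,PG,6-1
"""
df = pd.read_csv(io.StringIO(csv_text), index_col="Name")

# Row label first, then a list of column labels: the result is a Series
# holding only the requested columns of that one row.
subset = df.loc["Avery Bradley", ["Number", "Position", "Height"]]
print(subset)

# Variation: a list of row labels selects several rows and yields a DataFrame.
several_rows = df.loc[["Avery Bradley", "Raul Neto"], ["Number", "Height"]]
print(several_rows)
```

Note that the column labels must be wrapped in a list. A call such as df.loc['Avery Bradley', 'Number', 'Position'] would be read as three separate indexing levels and fails on a two-dimensional DataFrame.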
