Introduction
This article shows how to select all columns for the first row in a variable of type DataFrame. The demonstration runs a script in Jupyter Notebook, a web-based application that is handy for creating and sharing documents containing live code, equations, visualizations and narrative text. Its uses include data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning and many others. Any DataFrame variable will do, whatever the source of its data: a CSV file, a database connection or any other data source. First of all, run Jupyter Notebook as follows:
(myenv) C:\python\data-science>jupyter notebook
[I 17:08:28.062 NotebookApp] Serving notebooks from local directory: C:\python\data-science
[I 17:08:28.062 NotebookApp] The Jupyter Notebook is running at:
[I 17:08:28.067 NotebookApp] http://localhost:8888/?token=4dd9801ef2aacad1d445955b0ae4621b4c669da84c617e7b
[I 17:08:28.068 NotebookApp] or http://127.0.0.1:8888/?token=4dd9801ef2aacad1d445955b0ae4621b4c669da84c617e7b
[I 17:08:28.070 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 17:08:28.225 NotebookApp]
    To access the notebook, open this file in a browser:
        file:///C:/Users/Personal/AppData/Roaming/jupyter/runtime/nbserver-8796-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=4dd9801ef2aacad1d445955b0ae4621b4c669da84c617e7b
     or http://127.0.0.1:8888/?token=4dd9801ef2aacad1d445955b0ae4621b4c669da84c617e7b
Selecting a Row Using loc from a DataFrame with the Pandas Library in Jupyter Notebook
After starting Jupyter Notebook, run the script for selecting data from the DataFrame variable. Selection is possible using either ‘loc’ or ‘iloc’, both of which are indexers available on any DataFrame variable. The ‘loc’ indexer is label-based: it selects rows by their index labels and columns by their column names, whereas ‘iloc’ selects by integer position. So, in order to select rows by a meaningful label, first load the data with a chosen index column. In this context, the column name is “Name”, which is one of the columns available in the CSV file or returned by a query on a database. The general form of a selection using ‘loc’ is as follows:
import pandas as pd
data = pd.read_csv("file_csv.csv", index_col="column_name")
data.loc[row_selection, column_selection]
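To make the row and column selection concrete, here is a minimal sketch. It assumes a small, hypothetical CSV held in an in-memory string instead of a real file; the names (Alice, Bob, Carol) and the columns Age and City are illustrative only and do not come from the original data.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for "file_csv.csv".
csv_text = """Name,Age,City
Alice,30,Jakarta
Bob,25,Bandung
Carol,41,Surabaya
"""

# index_col="Name" turns the "Name" column into the row labels.
data = pd.read_csv(io.StringIO(csv_text), index_col="Name")

# One row label and one column label give a single value.
age_of_bob = data.loc["Bob", "Age"]

# Lists of row labels and column labels give a smaller DataFrame.
subset = data.loc[["Alice", "Carol"], ["Age", "City"]]

# A colon selects everything along that axis: all columns for one row.
alice_row = data.loc["Alice", :]

print(age_of_bob)
print(subset)
print(alice_row)
```

Both the row part and the column part accept a single label, a list of labels or a colon, so the same pattern covers a single cell, a block of cells or a whole row.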
The following script loads the CSV file using “Name” as the index column and displays the resulting DataFrame:
import pandas as pd
data = pd.read_csv("file_csv.csv", index_col="Name")
data
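The effect of index_col is easiest to see by loading the same data with and without it. The sketch below assumes hypothetical CSV content in an in-memory string; the rows and the Age and City columns are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content; only the "Name" column name comes from the article.
csv_text = "Name,Age,City\nAlice,30,Jakarta\nBob,25,Bandung\n"

# Without index_col, pandas numbers the rows 0, 1, 2, ...
default_index = pd.read_csv(io.StringIO(csv_text))

# With index_col="Name", the values of "Name" become the row labels
# and "Name" is no longer an ordinary column.
name_index = pd.read_csv(io.StringIO(csv_text), index_col="Name")

print(default_index.index.tolist())
print(name_index.index.tolist())
print(name_index.columns.tolist())
```

With the default index, ‘loc’ would expect the integer labels; with “Name” as the index, it expects one of the names.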
The output of the above script appears in the following image:
So, the rows in the output of the above script are labeled by the values of the “Name” column, which serves as the index. Since the variable data, which has the DataFrame type, uses “Name” as its index, the ‘loc’ indexer can look up a row by its label as follows, where "Name" stands for one of the values in that index:
data.loc["Name"]
For example, modifying the above script to use an actual value from the index selects a row, as in the following script:
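A minimal sketch of such a selection, assuming hypothetical CSV content loaded from an in-memory string (the names and the Age and City columns are illustrative, not taken from the original data):

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for "file_csv.csv".
csv_text = """Name,Age,City
Alice,30,Jakarta
Bob,25,Bandung
Carol,41,Surabaya
"""

data = pd.read_csv(io.StringIO(csv_text), index_col="Name")

# Passing a single index label returns that row as a Series
# whose entries are all of the columns.
row = data.loc["Bob"]

print(row)
```

The returned Series is named after the selected label, and its own index holds the column names.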
So, if the value passed to ‘loc’ matches one of the values in the “Name” index, it displays all of the columns, or attributes, of that row.
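Two related cases are worth sketching: selecting the first row regardless of its label, and handling a label that does not match anything in the index. The example below assumes hypothetical data, and select_row is an illustrative helper, not part of pandas.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for "file_csv.csv".
csv_text = """Name,Age,City
Alice,30,Jakarta
Bob,25,Bandung
"""

data = pd.read_csv(io.StringIO(csv_text), index_col="Name")

# First row by integer position, with all of its columns.
first_by_position = data.iloc[0]

# The same row through loc, using the first label in the index.
first_by_label = data.loc[data.index[0]]


def select_row(frame, label):
    """Return the row for `label`, or None when the label is not in the index."""
    if label in frame.index:
        return frame.loc[label]
    return None


print(first_by_position)
print(select_row(data, "Zed"))
```

Checking membership first avoids the KeyError that ‘loc’ raises when the label is not present in the index.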