Introduction
Like the previous article, ‘How to Access or Index a Single Element of a DataFrame from a CSV File in Python’, this article aims to retrieve elements from a DataFrame. The difference is that this one retrieves multiple elements at once. Here, the term element refers to a column. The previous article succeeded in retrieving a single element, or single column. This article modifies the syntax pattern slightly in order to retrieve multiple elements, or multiple columns.
How to Access or Index Multiple Elements of a DataFrame from a CSV File in Python
As in ‘How to Access or Index a Single Element of a DataFrame from a CSV File in Python’, below are the steps, from preparation to execution, for accessing or indexing a DataFrame to retrieve multiple elements:
- As usual, start by opening a command line interface. In this context it is the Command Prompt, since the process runs on a local device using the Microsoft Windows operating system. It appears as follows:
Microsoft Windows [Version 10.0.22000.978]
(c) Microsoft Corporation. All rights reserved.

C:\Users\Personal>
- In the Command Prompt, type ‘python’ to enter the Python command console as follows:
C:\Users\Personal>python
Python 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
See ‘How to Install Python in Microsoft Windows’ and ‘How to Install Python in Microsoft Windows 11’ for information on how to install Python. Both articles cover installation on a device running Microsoft Windows.
- In the Python command console, type the following to import the Pandas library:
C:\Users\Personal>python
Python 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
For reference, see ‘How to Install Pandas’ and ‘How to Use Pandas’ for more information about the Pandas library.
- After successfully importing the Pandas library, use it to read the CSV file as in the following command:
C:\Users\Personal>python
Python 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> df_nba = pd.read_csv('nba.csv')
Make sure that the ‘nba.csv’ file exists in the current working directory, which is the folder from which the python command was executed.
- With the DataFrame variable available, execute the describe() and info() methods to learn more about it as follows:
>>> df_nba.describe()
           Number         Age      Weight        Salary
count  457.000000  457.000000  457.000000  4.460000e+02
mean    17.678337   26.938731  221.522976  4.842684e+06
std     15.966090    4.404016   26.368343  5.229238e+06
min      0.000000   19.000000  161.000000  3.088800e+04
25%      5.000000   24.000000  200.000000  1.044792e+06
50%     13.000000   26.000000  220.000000  2.839073e+06
75%     25.000000   30.000000  240.000000  6.500000e+06
max     99.000000   40.000000  307.000000  2.500000e+07
>>> df_nba.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 458 entries, 0 to 457
Data columns (total 9 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   Name      457 non-null    object
 1   Team      457 non-null    object
 2   Number    457 non-null    float64
 3   Position  457 non-null    object
 4   Age       457 non-null    float64
 5   Height    457 non-null    object
 6   Weight    457 non-null    float64
 7   College   373 non-null    object
 8   Salary    446 non-null    float64
dtypes: float64(4), object(5)
memory usage: 32.3+ KB
>>>
- Finally, after retrieving the information about the DataFrame variable, especially the names of all the available columns, execute the following command to access or index the DataFrame and retrieve multiple elements:
>>> print(df_nba[['Name','Team','College']]);
              Name            Team            College
0    Avery Bradley  Boston Celtics              Texas
1      Jae Crowder  Boston Celtics          Marquette
2     John Holland  Boston Celtics  Boston University
3      R.J. Hunter  Boston Celtics      Georgia State
4    Jonas Jerebko  Boston Celtics                NaN
..             ...             ...                ...
453   Shelvin Mack       Utah Jazz             Butler
454      Raul Neto       Utah Jazz                NaN
455   Tibor Pleiss       Utah Jazz                NaN
456    Jeff Withey       Utah Jazz             Kansas
457            NaN             NaN                NaN

[458 rows x 3 columns]
>>>
So, instead of using single brackets with one string value as the key to retrieve only one column, use the double-bracket form [[]] to state that more than one column is to be retrieved. The inner brackets form a Python list of column names, and the outer brackets index the DataFrame with that list. The example above retrieves the ‘Name’, ‘Team’ and ‘College’ columns from the DataFrame variable.
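The difference between the two bracket forms can also be tried without the ‘nba.csv’ file. The following is only a minimal sketch: it assumes a three-row sample copied from the output above and read from an in-memory string, and the names sample_csv, df_sample and wanted are made up for this illustration.

```python
import io

import pandas as pd

# Assumption: three rows copied from the printed output above, used as a
# small stand-in for nba.csv (the real file has 9 columns and 458 rows).
sample_csv = io.StringIO(
    "Name,Team,College\n"
    "Avery Bradley,Boston Celtics,Texas\n"
    "Jae Crowder,Boston Celtics,Marquette\n"
    "Jonas Jerebko,Boston Celtics,\n"
)
df_sample = pd.read_csv(sample_csv)

# Single brackets with one string key return one column as a Series.
one_column = df_sample['Name']

# Double brackets pass a list of keys and return a DataFrame.
two_columns = df_sample[['Name', 'College']]

# The inner list can be built separately, which helps when it is long.
wanted = ['Name', 'College']
same_columns = df_sample[wanted]

print(type(one_column).__name__)
print(type(two_columns).__name__)
print(two_columns)
```

The order of the names in the list decides the order of the columns in the result, so the same form can also be used to rearrange columns.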