How to Access or Index Multiple Elements of a DataFrame from a CSV File in Python

Introduction

Like the previous article, ‘How to Access or Index a Single Element of a DataFrame from a CSV File in Python’, this article shows how to retrieve elements from a DataFrame. The difference is that this one retrieves multiple elements at once. Here, the term element refers to a column. The previous article retrieved a single column. This article modifies that syntax slightly to retrieve multiple columns.

How to Access or Index Multiple Elements of a DataFrame from a CSV File in Python

As in ‘How to Access or Index a Single Element of a DataFrame from a CSV File in Python’, the steps below go from preparation to execution for indexing a DataFrame to retrieve multiple elements:

  1. Start by opening a command line interface. In this context it is the Command Prompt, since the process runs on a local device with the Microsoft Windows operating system. It appears as follows:

    Microsoft Windows [Version 10.0.22000.978]
    (c) Microsoft Corporation. All rights reserved.
    
    C:\Users\Personal>
    
  2. In the Command Prompt, type ‘python’ to enter the Python interactive console as follows:

    C:\Users\Personal>python
    Python 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 
    

    See ‘How to Install Python in Microsoft Windows’ and ‘How to Install Python in Microsoft Windows 11’ for information on installing Python on a device running Microsoft Windows.

  3. In the Python console, type the following to import the Pandas library:

    C:\Users\Personal>python
    Python 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import pandas as pd
    
    

    For reference, see ‘How to Install Pandas’ and ‘How to Use Pandas’ for more information about the Pandas library.

  4. After successfully importing the Pandas library, use it to read the CSV file as in the following command:

    C:\Users\Personal>python
    Python 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import pandas as pd
    >>> df_nba = pd.read_csv('nba.csv')
    
    

    Make sure that the ‘nba.csv’ file exists in the current working directory, that is, the folder where the python command was executed. Otherwise, pass the full path to the file.

  5. With the DataFrame variable available, call the describe() and info() methods to learn more about it as follows:

    >>> df_nba.describe()
    
              Number        Age     Weight       Salary
    count 457.000000 457.000000 457.000000 4.460000e+02
    mean   17.678337  26.938731 221.522976 4.842684e+06
    std    15.966090   4.404016  26.368343 5.229238e+06
    min     0.000000  19.000000 161.000000 3.088800e+04
    25%     5.000000  24.000000 200.000000 1.044792e+06
    50%    13.000000  26.000000 220.000000 2.839073e+06
    75%    25.000000  30.000000 240.000000 6.500000e+06
    max    99.000000  40.000000 307.000000 2.500000e+07
    >>> df_nba.info()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 458 entries, 0 to 457
    Data columns (total 9 columns):
    #   Column   Non-Null Count Dtype
    --- ------   -------------- -----
    0   Name     457 non-null   object
    1   Team     457 non-null   object
    2   Number   457 non-null   float64
    3   Position 457 non-null   object
    4   Age      457 non-null   float64
    5   Height   457 non-null   object
    6   Weight   457 non-null   float64
    7   College  373 non-null   object
    8   Salary   446 non-null   float64
    dtypes: float64(4), object(5)
    memory usage: 32.3+ KB
    >>>
    
  6. Finally, with that information in hand, especially the names of all the available columns, execute the following command to index the DataFrame and retrieve multiple elements:

    >>> print(df_nba[['Name','Team','College']])
                  Name            Team            College
    0    Avery Bradley  Boston Celtics              Texas
    1      Jae Crowder  Boston Celtics          Marquette
    2     John Holland  Boston Celtics  Boston University
    3      R.J. Hunter  Boston Celtics      Georgia State
    4    Jonas Jerebko  Boston Celtics                NaN
    ..             ...             ...                ...
    453   Shelvin Mack       Utah Jazz             Butler
    454      Raul Neto       Utah Jazz                NaN
    455   Tibor Pleiss       Utah Jazz                NaN
    456    Jeff Withey       Utah Jazz             Kansas
    457            NaN             NaN                NaN
    
    [458 rows x 3 columns]
    >>>
    

    So, instead of using single brackets with one string as the key to retrieve one column, pass a list of column names inside the brackets, which gives the [[]] form, to retrieve several columns. The example above retrieves the ‘Name’, ‘Team’ and ‘College’ columns from the DataFrame variable.
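
To make the difference between the two bracket forms concrete, here is a minimal sketch written as a script instead of a console session. It does not read ‘nba.csv’. It builds a three-row stand-in from rows shown in the output above, so the variable names (csv_text, df, one_column and so on) are assumptions made for illustration only.

```python
import io

import pandas as pd

# Stand-in for nba.csv: three rows copied from the output shown earlier.
# The empty last field becomes NaN, like the missing College values there.
csv_text = """Name,Team,College
Avery Bradley,Boston Celtics,Texas
Jae Crowder,Boston Celtics,Marquette
Jonas Jerebko,Boston Celtics,
"""
df = pd.read_csv(io.StringIO(csv_text))

# Single brackets with one label return a Series (one column).
one_column = df['Name']

# Brackets holding a list of labels return a DataFrame (several columns).
two_columns = df[['Name', 'College']]

# A list with a single label still returns a DataFrame, not a Series.
one_column_frame = df[['Name']]

print(type(one_column).__name__)
print(type(two_columns).__name__)
print(type(one_column_frame).__name__)
```

The outer pair of brackets is the indexing operation and the inner pair is an ordinary Python list, so the column names can also be kept in a variable and passed as df[columns].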

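Step 4 depends on the CSV file being reachable from the current working directory. The sketch below is one possible way to check that before reading, and to load only the needed columns with the usecols parameter of read_csv. The helper name load_columns and the temporary sample file are hypothetical and exist only so the example runs without ‘nba.csv’.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_columns(csv_path, columns):
    """Read only the requested columns from a CSV file (hypothetical helper)."""
    path = Path(csv_path)
    if not path.is_file():
        # Relative paths are resolved against the current working directory.
        raise FileNotFoundError(
            f"{path} not found; current directory is {Path.cwd()}"
        )
    # usecols asks read_csv to parse only the listed columns.
    return pd.read_csv(path, usecols=columns)


# Demo with a tiny temporary file. The Number values are placeholders,
# not real data.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    sample.write_text(
        "Name,Team,Number,College\n"
        "Avery Bradley,Boston Celtics,1,Texas\n"
        "Jae Crowder,Boston Celtics,2,Marquette\n"
    )
    subset = load_columns(sample, ["Name", "Team", "College"])

print(subset.columns.tolist())
```

With the real file, the call would look like load_columns('nba.csv', ['Name', 'Team', 'College']). Selecting columns at read time skips the columns that are not needed, while the [[]] form in step 6 selects from a DataFrame that is already fully loaded.
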