Introduction
This article shows how to write the data in a DataFrame to a CSV file using the pandas library, from a script running in Jupyter Notebook. There are a few things to do before writing the data. Of course, the first step is to start Jupyter Notebook, since the script runs inside it. The following is the output of starting it:
(myenv) C:\python\data-science>jupyter notebook
[I 17:08:28.062 NotebookApp] Serving notebooks from local directory: C:\python\data-science
[I 17:08:28.062 NotebookApp] The Jupyter Notebook is running at:
[I 17:08:28.067 NotebookApp] http://localhost:8888/?token=4dd9801ef2aacad1d445955b0ae4621b4c669da84c617e7b
[I 17:08:28.068 NotebookApp] or http://127.0.0.1:8888/?token=4dd9801ef2aacad1d445955b0ae4621b4c669da84c617e7b
[I 17:08:28.070 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 17:08:28.225 NotebookApp]
    To access the notebook, open this file in a browser:
        file:///C:/Users/Personal/AppData/Roaming/jupyter/runtime/nbserver-8796-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=4dd9801ef2aacad1d445955b0ae4621b4c669da84c617e7b
     or http://127.0.0.1:8888/?token=4dd9801ef2aacad1d445955b0ae4621b4c669da84c617e7b
Generate CSV file
Once Jupyter Notebook is running, open it and execute the script. The data in a DataFrame can come from many sources, for example from a database connection. Let’s assume that the process of getting the data is already done. For further reference on getting data from a database, see the page listing several articles on the topic. One of them is How to Get Data From a PostgreSQL Database in Jupyter Notebook. Once the data is available in a DataFrame, persist it to a CSV file with the following command pattern:
df.to_csv(r'Path where you want to store the exported CSV file\File Name.csv', index = False)
In the pattern above, df is the DataFrame variable that holds the data, and ‘to_csv’ is a DataFrame method that writes that data to a CSV file. The r prefix makes the path a raw string, so the backslashes in a Windows path are not treated as escape sequences, and index = False leaves the row index out of the file. For example, the following is an execution of the above command pattern:
The script at the top comes from the article How to Get Data From a PostgreSQL Database in Jupyter Notebook. After that script executes successfully, the next script saves the data held in the data variable to a CSV file. It follows the pattern above:
data.to_csv("file_csv_example.csv")
Shortly after that, a new file appears in the same folder as the script. Because this call does not pass index = False, the file also contains the row index as its first column. The file exists as in the following image:
The script in this example is ‘postgresql-db-connect-get-data.ipynb’, and the new file produced by running it is ‘file_csv_example.csv’.
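As a minimal sketch of the steps above, the following writes a DataFrame to a CSV file and reads it back to confirm the result. The sample rows, the column names id and name, and the temporary output folder are assumptions made for illustration; in the article the data variable comes from a PostgreSQL query instead. The second file, file_csv_with_index.csv, is also hypothetical and only there to show what the default index setting does.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the rows fetched from a database.
data = pd.DataFrame({"id": [1, 2, 3], "name": ["alpha", "beta", "gamma"]})

# Write into a temporary folder so the sketch does not touch existing files.
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "file_csv_example.csv")
with_index_path = os.path.join(out_dir, "file_csv_with_index.csv")

# index=False leaves the row labels (0, 1, 2) out of the file.
data.to_csv(csv_path, index=False)

# The default (index=True) writes the row labels as an unnamed first column.
data.to_csv(with_index_path)

# Read both files back to compare their columns.
restored = pd.read_csv(csv_path)
restored_with_index = pd.read_csv(with_index_path)

print(os.path.exists(csv_path))
print(restored.columns.tolist())
print(restored_with_index.columns.tolist())
```

Passing a bare file name such as "file_csv_example.csv", as the article does, writes the file relative to the current working directory, which for a notebook is normally the folder the notebook lives in.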